From: Denys Vlasenko on 18 Jul 2010 11:10
Hi Tim, Tim, folks,

Update on -ffunction-sections status: I re-tested linux-2.6.35-rc4 today.
Most of the work needed for -ffunction-sections -fdata-sections
is already in this kernel. This mail explains what is still missing.

In order to have a working kernel with this make invocation:

    make KCFLAGS="-ffunction-sections -fdata-sections"

linux-2.6.35-rc4 needs three patches:

* modpost fix for 64k+ sections: linux-2.6.35-rc4-fs.modpost.patch
  This patch is in -mm; it has still not reached mainline...

* fix for kernel linker scripts: linux-2.6.35-rc4-fs.fix-kernel-linker-scripts.patch
  It makes _all_ linker scripts -ffunction/data-sections safe via:
  -		*(.data)
  +		*(.data .data.*)

* fix for module linker script: linux-2.6.35-rc4-fs.fix-ko-module-linker-script.patch
  Prevents kernel modules from having unnecessarily many
  sections and thus prevents module size growth.

Then, in order to also garbage-collect the sections, I added

    LDFLAGS_vmlinux += --gc-sections

in the top-level Makefile. This requires the additional patch
(linux-2.6.35-rc4-fsgs.patch), which adds KEEP(section) directives
to the kernel linker scripts. Otherwise, the linker will discard
some crucial sections.

All four patches are attached.

I am sending this email from the machine which runs the kernel
built with -ffunction-sections -fdata-sections --gc-sections.
--
vda
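For readers unfamiliar with the flags discussed above, here is a minimal userspace sketch, not kernel code. The file name, the function and variable names, and the commands in the comment are illustrative assumptions; the exact section names can vary with compiler version and optimization level.

```c
/*
 * sections_demo.c: hypothetical userspace illustration, not kernel code.
 *
 * Assumed commands with a GNU toolchain:
 *   gcc -O2 -ffunction-sections -fdata-sections -c sections_demo.c
 *   readelf -S sections_demo.o
 * The section list should then contain per-symbol sections such as
 * .text.used_helper, .text.unused_helper and .data.unused_table
 * instead of one shared .text and one shared .data.
 *
 * If the object is later linked with --gc-sections, the linker may
 * discard any of these sections that nothing references.
 */

/* Referenced by callers in a real program: its section stays reachable. */
int used_helper(int x)
{
	return 2 * x + 1;
}

/* Never referenced elsewhere: a candidate for garbage collection. */
int unused_helper(int x)
{
	return x * x;
}

/* Initialized data: gets its own .data.* section with -fdata-sections. */
int unused_table[4] = { 1, 2, 3, 4 };
```

This also shows why the linker scripts need `*(.data .data.*)`: once every object carries `.data.<symbol>` input sections, a rule that names only `.data` no longer places them where the kernel expects.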
From: Geert Uytterhoeven on 18 Jul 2010 14:30
On Sun, Jul 18, 2010 at 17:03, Denys Vlasenko <vda.linux(a)googlemail.com> wrote:
> I am sending this email from the machine which runs the kernel
> built with -ffunction-sections -fdata-sections --gc-sections.

(sorry for the previous HTML-email)

For the record, what are the metrics of this kernel vs. the standard one?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Denys Vlasenko on 18 Jul 2010 19:40
On Sunday 18 July 2010 20:24, Geert Uytterhoeven wrote:
> On Sun, Jul 18, 2010 at 17:03, Denys Vlasenko <vda.linux(a)googlemail.com> wrote:
> > I am sending this email from the machine which runs the kernel
> > built with -ffunction-sections -fdata-sections --gc-sections.
>
> (sorry for the previous HTML-email)
>
> For the record, what are the metrics of this kernel vs. the standard one?

Kernel:

   text    data     bss     dec     hex filename
8299299  857324  785348 9941971  97b3d3 linux-2.6.35-rc4.obj/vmlinux
7656461  841508  783700 9281669  8da085 linux-2.6.35-rc4-fs.obj/vmlinux
7566908  832388  717844 9117140  8b1dd4 linux-2.6.35-rc4-fsgs.obj/vmlinux

The largest module in my build:

   text    data     bss     dec     hex filename
 451009   54640    2224  507873   7bfe1 linux-2.6.35-rc4.obj/fs/xfs/xfs.ko
 450519   54292    2202  507013   7bc85 linux-2.6.35-rc4-fs.obj/fs/xfs/xfs.ko
 450521   54292    2202  507015   7bc87 linux-2.6.35-rc4-fsgs.obj/fs/xfs/xfs.ko
--
vda
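The savings implied by the vmlinux rows can be worked out with a few lines of C. This is only an arithmetic sketch: the struct and function names are hypothetical, and the numbers are copied from the table above.

```c
#include <stdio.h>

/* One row of size(1) output; field names mirror the column headers. */
struct size_row {
	const char *filename;
	unsigned long text, data, bss;
};

/* Rows copied from the vmlinux table in the message above. */
static const struct size_row vmlinux_rows[] = {
	{ "linux-2.6.35-rc4.obj/vmlinux",      8299299, 857324, 785348 },
	{ "linux-2.6.35-rc4-fs.obj/vmlinux",   7656461, 841508, 783700 },
	{ "linux-2.6.35-rc4-fsgs.obj/vmlinux", 7566908, 832388, 717844 },
};

/* The "dec" column: text + data + bss. */
static unsigned long row_total(const struct size_row *r)
{
	return r->text + r->data + r->bss;
}

/* Bytes saved by row r relative to the baseline row. */
static unsigned long bytes_saved(const struct size_row *base,
				 const struct size_row *r)
{
	return row_total(base) - row_total(r);
}

/* Print absolute and relative savings for every non-baseline row. */
static void print_savings(void)
{
	const struct size_row *base = &vmlinux_rows[0];
	size_t i;

	for (i = 1; i < sizeof(vmlinux_rows) / sizeof(vmlinux_rows[0]); i++) {
		unsigned long saved = bytes_saved(base, &vmlinux_rows[i]);

		printf("%s: %lu bytes smaller (%.2f%%)\n",
		       vmlinux_rows[i].filename, saved,
		       100.0 * (double)saved / (double)row_total(base));
	}
}
```

By this arithmetic the -fs build is 660302 bytes (about 6.6%) smaller than the baseline and the -fsgs build is 824831 bytes (about 8.3%) smaller, while the xfs.ko rows differ by under 1000 bytes.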
From: Sam Ravnborg on 23 Jul 2010 15:20
>
> * modpost fix for 64k+ sections: linux-2.6.35-rc4-fs.modpost.patch
>   This patch is in -mm, it still not reach mainline...
>

Some comments below - but nothing fundamental.

	Sam

--- linux-2.6.35-rc4/scripts/mod/file2alias.c
+++ linux-2.6.35-rc4-fs.obj1/scripts/mod/file2alias.c
@@ -884,16 +884,16 @@
 	char *zeros = NULL;
 	/* We're looking for a section relative symbol */
-	if (!sym->st_shndx || sym->st_shndx >= info->hdr->e_shnum)
+	if (!sym->st_shndx || get_secindex(info, sym) >= info->num_sections)
 		return;
 	/* Handle all-NULL symbols allocated into .bss */
-	if (info->sechdrs[sym->st_shndx].sh_type & SHT_NOBITS) {
+	if (info->sechdrs[get_secindex(info, sym)].sh_type & SHT_NOBITS) {
 		zeros = calloc(1, sym->st_size);
 		symval = zeros;
 	} else {
 		symval = (void *)info->hdr
-			+ info->sechdrs[sym->st_shndx].sh_offset
+			+ info->sechdrs[get_secindex(info, sym)].sh_offset
 			+ sym->st_value;
 	}
--- linux-2.6.35-rc4/scripts/mod/modpost.c
+++ linux-2.6.35-rc4-fs.obj1/scripts/mod/modpost.c
@@ -253,7 +253,7 @@
 		return export_unknown;
 }
-static enum export export_from_sec(struct elf_info *elf, Elf_Section sec)
+static enum export export_from_sec(struct elf_info *elf, unsigned int sec)
 {
 	if (sec == elf->export_sec)
 		return export_plain;
@@ -373,6 +373,8 @@
 	Elf_Ehdr *hdr;
 	Elf_Shdr *sechdrs;
 	Elf_Sym *sym;
+	const char *secstrings;
+	unsigned int symtab_idx = ~0U, symtab_shndx_idx = ~0U;
 	hdr = grab_file(filename, &info->size);
 	if (!hdr) {
@@ -417,8 +419,19 @@
 		return 0;
 	}
+	/* Fixup for more than 64k sections */
+	info->num_sections = hdr->e_shnum;
+	if (info->num_sections == 0) { /* more than 64k sections? */
+		/* note: it doesn't need shndx2secindex() */
+		info->num_sections = TO_NATIVE(sechdrs[0].sh_size);
+	}

I had to read the above twice to get it.
How about something like this:

	/* Fixup for more than 64k sections */
	if (hdr->e_shnum == 0) {
		/*
		 * There are more than 64k sections,
		 * read count from .sh_size.
		 * note: it doesn't need shndx2secindex()
		 */
		info->num_sections = TO_NATIVE(sechdrs[0].sh_size);
	} else {
		info->num_sections = hdr->e_shnum;
	}

+	info->secindex_strings = hdr->e_shstrndx;
+	if (info->secindex_strings == SHN_XINDEX)
+		info->secindex_strings =
+			shndx2secindex(TO_NATIVE(sechdrs[0].sh_link));

Likewise here...

	/* e_shstrndx == SHN_XINDEX if we have > 64k strings */
	if (hdr->e_shstrndx != SHN_XINDEX)
		info->secindex_strings = hdr->e_shstrndx;
	else
		info->secindex_strings =
			shndx2secindex(TO_NATIVE(sechdrs[0].sh_link));

 	/* Fix endianness in section headers */
-	for (i = 0; i < hdr->e_shnum; i++) {
+	for (i = 0; i < info->num_sections; i++) {
 		sechdrs[i].sh_name = TO_NATIVE(sechdrs[i].sh_name);
 		sechdrs[i].sh_type = TO_NATIVE(sechdrs[i].sh_type);
 		sechdrs[i].sh_flags = TO_NATIVE(sechdrs[i].sh_flags);
@@ -431,9 +444,8 @@
 		sechdrs[i].sh_entsize = TO_NATIVE(sechdrs[i].sh_entsize);
 	}
 	/* Find symbol table. */
-	for (i = 1; i < hdr->e_shnum; i++) {
-		const char *secstrings
-			= (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+	secstrings = (void *)hdr + sechdrs[info->secindex_strings].sh_offset;

Moving this assignment out of the loop is an unrelated but welcome change.

+	for (i = 1; i < info->num_sections; i++) {
 		const char *secname;
 		int nobits = sechdrs[i].sh_type == SHT_NOBITS;
@@ -461,14 +473,26 @@
 		else if (strcmp(secname, "__ksymtab_gpl_future") == 0)
 			info->export_gpl_future_sec = i;
-		if (sechdrs[i].sh_type != SHT_SYMTAB)
-			continue;
+		if (sechdrs[i].sh_type == SHT_SYMTAB) {
+			unsigned int sh_link_idx;
+			symtab_idx = i;
+			info->symtab_start = (void *)hdr +
+			    sechdrs[i].sh_offset;
+			info->symtab_stop = (void *)hdr +
+			    sechdrs[i].sh_offset + sechdrs[i].sh_size;
+			sh_link_idx = shndx2secindex(sechdrs[i].sh_link);
+			info->strtab = (void *)hdr +
+			    sechdrs[sh_link_idx].sh_offset;
+		}
-		info->symtab_start = (void *)hdr + sechdrs[i].sh_offset;
-		info->symtab_stop = (void *)hdr + sechdrs[i].sh_offset
-			+ sechdrs[i].sh_size;
-		info->strtab = (void *)hdr +
-			sechdrs[sechdrs[i].sh_link].sh_offset;
+		/* 32bit section no. table? ("more than 64k sections") */
+		if (sechdrs[i].sh_type == SHT_SYMTAB_SHNDX) {
+			symtab_shndx_idx = i;
+			info->symtab_shndx_start = (void *)hdr +
+			    sechdrs[i].sh_offset;
+			info->symtab_shndx_stop = (void *)hdr +
+			    sechdrs[i].sh_offset + sechdrs[i].sh_size;
+		}
 	}
 	if (!info->symtab_start)
 		fatal("%s has no symtab?\n", filename);
@@ -480,6 +504,21 @@
 		sym->st_value = TO_NATIVE(sym->st_value);
 		sym->st_size = TO_NATIVE(sym->st_size);
 	}
+
+	if (symtab_shndx_idx != ~0U) {
+		Elf32_Word *p;
+		if (symtab_idx !=
+		    shndx2secindex(sechdrs[symtab_shndx_idx].sh_link))
+			fatal("%s: SYMTAB_SHNDX has bad sh_link: %u!=%u\n",
+			      filename,
+			      shndx2secindex(sechdrs[symtab_shndx_idx].sh_link),
+			      symtab_idx);
+		/* Fix endianness */
+		for (p = info->symtab_shndx_start; p < info->symtab_shndx_stop;
+		     p++)
+			*p = TO_NATIVE(*p);
+	}
+
 	return 1;
 }
@@ -514,7 +553,7 @@
 		Elf_Sym *sym, const char *symname)
 {
 	unsigned int crc;
-	enum export export = export_from_sec(info, sym->st_shndx);
+	enum export export = export_from_sec(info, get_secindex(info, sym));
 	switch (sym->st_shndx) {
 	case SHN_COMMON:
@@ -656,19 +695,19 @@
 	return "(unknown)";
 }
-static const char *sec_name(struct elf_info *elf, int shndx)
+static const char *sec_name(struct elf_info *elf, int secindex)
 {
 	Elf_Shdr *sechdrs = elf->sechdrs;
 	return (void *)elf->hdr +
-		elf->sechdrs[elf->hdr->e_shstrndx].sh_offset +
-		sechdrs[shndx].sh_name;
+		elf->sechdrs[elf->secindex_strings].sh_offset +
+		sechdrs[secindex].sh_name;
 }
 static const char *sech_name(struct elf_info *elf, Elf_Shdr *sechdr)
 {
 	return (void *)elf->hdr +
-		elf->sechdrs[elf->hdr->e_shstrndx].sh_offset +
-		sechdr->sh_name;
+		elf->sechdrs[elf->secindex_strings].sh_offset +
+		sechdr->sh_name;
 }
 /* if sym is empty or point to a string
@@ -1047,11 +1086,14 @@
 	Elf_Sym *near = NULL;
 	Elf64_Sword distance = 20;
 	Elf64_Sword d;
+	unsigned int relsym_secindex;
 	if (relsym->st_name != 0)
 		return relsym;
+
+	relsym_secindex = get_secindex(elf, relsym);
 	for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
-		if (sym->st_shndx != relsym->st_shndx)
+		if (get_secindex(elf, sym) != relsym_secindex)
 			continue;
 		if (ELF_ST_TYPE(sym->st_info) == STT_SECTION)
 			continue;
@@ -1113,9 +1155,9 @@
 	for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
 		const char *symsec;
-		if (sym->st_shndx >= SHN_LORESERVE)
+		if (is_shndx_special(sym->st_shndx))
 			continue;
-		symsec = sec_name(elf, sym->st_shndx);
+		symsec = sec_name(elf, get_secindex(elf, sym));
 		if (strcmp(symsec, sec) != 0)
 			continue;
 		if (!is_valid_name(elf, sym))
@@ -1311,7 +1353,7 @@
 	const char *tosec;
 	const struct sectioncheck *mismatch;
-	tosec = sec_name(elf, sym->st_shndx);
+	tosec = sec_name(elf, get_secindex(elf, sym));
 	mismatch = section_mismatch(fromsec, tosec);
 	if (mismatch) {
 		Elf_Sym *to;
@@ -1339,7 +1381,7 @@
 			 Elf_Shdr *sechdr, Elf_Rela *r)
 {
 	Elf_Shdr *sechdrs = elf->sechdrs;
-	int section = sechdr->sh_info;
+	int section = shndx2secindex(sechdr->sh_info);
 	return (void *)elf->hdr + sechdrs[section].sh_offset +
 		r->r_offset - sechdrs[section].sh_addr;
@@ -1447,7 +1489,7 @@
 		r.r_addend = TO_NATIVE(rela->r_addend);
 		sym = elf->symtab_start + r_sym;
 		/* Skip special sections */
-		if (sym->st_shndx >= SHN_LORESERVE)
+		if (is_shndx_special(sym->st_shndx))
 			continue;
 		check_section_mismatch(modname, elf, &r, sym, fromsec);
 	}
@@ -1505,7 +1547,7 @@
 		}
 		sym = elf->symtab_start + r_sym;
 		/* Skip special sections */
-		if (sym->st_shndx >= SHN_LORESERVE)
+		if (is_shndx_special(sym->st_shndx))
 			continue;
 		check_section_mismatch(modname, elf, &r, sym, fromsec);
 	}
@@ -1530,7 +1572,7 @@
 	Elf_Shdr *sechdrs = elf->sechdrs;
 	/* Walk through all sections */
-	for (i = 0; i < elf->hdr->e_shnum; i++) {
+	for (i = 0; i < elf->num_sections; i++) {
 		check_section(modname, elf, &elf->sechdrs[i]);
 		/* We want to process only relocation sections and not .init */
 		if (sechdrs[i].sh_type == SHT_RELA)
--- linux-2.6.35-rc4/scripts/mod/modpost.h
+++ linux-2.6.35-rc4-fs.obj1/scripts/mod/modpost.h
@@ -129,7 +129,50 @@
 	const char *strtab;
 	char *modinfo;
 	unsigned int modinfo_len;
+
+	/* support for 32bit section numbers */
+
+	unsigned int num_sections; /* max_secindex + 1 */
+	unsigned int secindex_strings;
+	/* if Nth symbol table entry has .st_shndx = SHN_XINDEX,
+	 * take shndx from symtab_shndx_start[N] instead */
+	Elf32_Word *symtab_shndx_start;
+	Elf32_Word *symtab_shndx_stop;
 };
+
+static inline int is_shndx_special(unsigned int i)
+{
+	return i != SHN_XINDEX && i >= SHN_LORESERVE && i <= SHN_HIRESERVE;
+}
+
+/* shndx is in [0..SHN_LORESERVE) U (SHN_HIRESERVE, 0xfffffff], thus:
+ * shndx == 0               <=> sechdrs[0]
+ * ......
+ * shndx == SHN_LORESERVE-1 <=> sechdrs[SHN_LORESERVE-1]
+ * shndx == SHN_HIRESERVE+1 <=> sechdrs[SHN_LORESERVE]
+ * shndx == SHN_HIRESERVE+2 <=> sechdrs[SHN_LORESERVE+1]
+ * ......
+ * fyi: sym->st_shndx is uint16, SHN_LORESERVE = ff00, SHN_HIRESERVE = ffff,
+ * so basically we map  0000..feff -> 0000..feff
+ *                      ff00..ffff -> (you are a bad boy, dont do it)
+ *                      10000..xxxx -> ff00..(xxxx-0x100)
+ */
+static inline unsigned int shndx2secindex(unsigned int i)
+{
+	if (i <= SHN_HIRESERVE)
+		return i;
+	return i - (SHN_HIRESERVE + 1 - SHN_LORESERVE);
+}
+
+/* Accessor for sym->st_shndx, hides ugliness of "64k sections" */
+static inline unsigned int get_secindex(const struct elf_info *info,
+					const Elf_Sym *sym)
+{
+	if (sym->st_shndx != SHN_XINDEX)
+		return sym->st_shndx;
+	return shndx2secindex(info->symtab_shndx_start[sym -
+						       info->symtab_start]);
+}
 /* file2alias.c */
 extern unsigned int cross_build;
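The index handling in the modpost patch can be tried outside the kernel tree with a small standalone sketch. The `demo_` types and names below are hypothetical, cut-down stand-ins for modpost's `Elf_Sym` and `struct elf_info`; the arithmetic mirrors the patch as posted and is not meant as a statement of what mainline modpost does.

```c
#include <stddef.h>
#include <stdint.h>

/* Reserved index values from the ELF specification; the DEMO_ prefix
 * avoids clashing with <elf.h>. */
#define DEMO_SHN_LORESERVE 0xff00u
#define DEMO_SHN_XINDEX    0xffffu
#define DEMO_SHN_HIRESERVE 0xffffu

/* Hypothetical minimal stand-ins for Elf_Sym and struct elf_info. */
struct demo_sym {
	uint16_t st_shndx;
};

struct demo_info {
	const struct demo_sym *symtab_start;
	const uint32_t *symtab_shndx_start; /* SHT_SYMTAB_SHNDX contents */
};

/* e_shnum == 0 means the real count lives in sechdrs[0].sh_size. */
static unsigned int demo_num_sections(uint16_t e_shnum, uint64_t sh0_size)
{
	return e_shnum != 0 ? e_shnum : (unsigned int)sh0_size;
}

/* Same mapping as shndx2secindex() in the patch: indexes above the
 * reserved range are shifted down by the size of that range (0x100). */
static unsigned int demo_shndx2secindex(unsigned int i)
{
	if (i <= DEMO_SHN_HIRESERVE)
		return i;
	return i - (DEMO_SHN_HIRESERVE + 1 - DEMO_SHN_LORESERVE);
}

/* Reserved values other than SHN_XINDEX do not name a real section. */
static int demo_is_shndx_special(unsigned int i)
{
	return i != DEMO_SHN_XINDEX &&
	       i >= DEMO_SHN_LORESERVE && i <= DEMO_SHN_HIRESERVE;
}

/* For SHN_XINDEX, take the index from the parallel 32-bit table. */
static unsigned int demo_get_secindex(const struct demo_info *info,
				      const struct demo_sym *sym)
{
	if (sym->st_shndx != DEMO_SHN_XINDEX)
		return sym->st_shndx;
	return demo_shndx2secindex(
		info->symtab_shndx_start[sym - info->symtab_start]);
}
```

With a symbol whose `st_shndx` is `SHN_XINDEX` and whose entry in the 32-bit table is 0x10002, `demo_get_secindex()` yields 0xff02, which matches the `10000..xxxx -> ff00..(xxxx-0x100)` line in the patch comment.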
From: Sam Ravnborg on 23 Jul 2010 16:30
> * fix for module linker script: linux-2.6.35-rc4-fs.fix-ko-module-linker-script.patch
>   Prevents kernel modules from having unnecessarily many
>   sections and thus prevents module size growth.

Acked-by: Sam Ravnborg <sam(a)ravnborg.org>

--- linux-2.6.35-rc4/scripts/module-common.lds
+++ linux-2.6.35-rc4.new/scripts/module-common.lds
@@ -3,6 +3,29 @@
  * Archs are free to supply their own linker scripts. ld will
  * combine them automatically.
  */
+
+/* .data.foo are generated by gcc itself with -fdata-sections,
+ * whereas double-dot sections (like .data..percpu) are generated
+ * by kernel's magic macros.
+ *
+ * Since this script does not specify what to do with double-dot sections,
+ * ld -r will coalesce all .data..foo input sections into one .data..foo
+ * output section, all .data..bar input sections into one .data..bar
+ * output section and so on. This is exactly what we want.
+ *
+ * Same goes for .text, .bss and .rodata. In case of .rodata, various
+ * .rodata.foo sections are generated by gcc even without -fdata-sections
+ */
+
 SECTIONS {
+
+	/* Coalesce sections produced by gcc -ffunction-sections */
+	.text 0 : AT(0) { *(.text .text.[A-Za-z0-9_$^]*) }
+
+	/* Coalesce sections produced by gcc -fdata-sections */
+	.rodata 0 : AT(0) { *(.rodata .rodata.[A-Za-z0-9_$^]*) }
+	.data 0 : AT(0) { *(.data .data.[A-Za-z0-9_$^]*) }
+	.bss 0 : AT(0) { *(.bss .bss.[A-Za-z0-9_$^]*) }
+
 	/DISCARD/ : { *(.discard) }
 }
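The patterns in this linker script are meant to pick up single-dot sections such as `.data.foo` while leaving double-dot sections such as `.data..percpu` alone. The sketch below checks that with POSIX `fnmatch()`, on the assumption that ld's wildcard matching behaves the same way for these particular patterns. The helper name `is_coalesced` is hypothetical.

```c
#include <fnmatch.h>
#include <stdio.h>

/*
 * Hypothetical helper: would the input-section rule
 *     *(<base> <base>.[A-Za-z0-9_$^]*)
 * pick up the input section named `section`?
 * Returns 1 if it would, 0 otherwise.
 */
static int is_coalesced(const char *base, const char *section)
{
	char pattern[64];

	/* First alternative: the plain name, e.g. ".data". */
	if (fnmatch(base, section, 0) == 0)
		return 1;

	/* Second alternative: base, one dot, then a character from the
	 * bracket set. A second dot is not in the set, so names like
	 * ".data..percpu" do not match. */
	snprintf(pattern, sizeof(pattern), "%s.[A-Za-z0-9_$^]*", base);
	return fnmatch(pattern, section, 0) == 0;
}
```

Under that assumption, `is_coalesced(".data", ".data.foo")` is true and `is_coalesced(".data", ".data..percpu")` is false, so the kernel's double-dot sections keep their own output sections as the comment in the patch describes.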