|
Bugzilla – Full Text Bug Listing |
| Summary: | GCC 8: xf86-video-intel build fails on i586 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Martin Liška <martin.liska> |
| Component: | X.Org | Assignee: | Michal Srb <msrb> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <xorg-maintainer-bugs> |
| Severity: | Normal | ||
| Priority: | P4 - Low | CC: | jan.hubicka, martin.liska, matz, msrb, mstaudt, ptesarik, rguenther, sndirsch |
| Version: | Current | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Bug Depends on: | |||
| Bug Blocks: | 1084649 | ||
| Attachments: | Fix build on i686 | ||
|
Description
Martin Liška
2018-05-09 11:27:46 UTC
If I read it correctly, gcc 8 does not want to inline `emit_vertex` into `vertex_emit_2s` because each use different target specific options.
Their full signatures are:
static __attribute__((always_inline)) void
vertex_emit_2s(struct sna *sna, int16_t x, int16_t y)
__attribute__((target("sse2,fpmath=sse"))) __attribute__((always_inline))
static void emit_vertex(/* omitted */)
Is it a new behavior that gcc 8 refuses to inline function into another if the target options differ? Or is the warning/error for failing always_inline new?
CC the GCC people. (In reply to Michal Srb from comment #1) > If I read it correctly, gcc 8 does not want to inline `emit_vertex` into > `vertex_emit_2s` because each use different target specific options. > > Their full signatures are: > > static __attribute__((always_inline)) void > vertex_emit_2s(struct sna *sna, int16_t x, int16_t y) > > __attribute__((target("sse2,fpmath=sse"))) __attribute__((always_inline)) > static void emit_vertex(/* omitted */) > > Is it a new behavior that gcc 8 refuses to inline function into another if > the target options differ? Or is the warning/error for failing always_inline > new? The specific situation is new but GCC always refuses to inline across incompatible target options. The new situation is that it considers functions without target attribute to be decorated with the effective target options of the compilation. (In reply to Richard Biener from comment #3) >[...] > The new situation is that it considers functions without > target attribute to be decorated with the effective target options of the > compilation. Such behaviour sounds reasonable for non-inline functions, but AFAICS it is not useful for inline functions, where appropriate target code is generated for each call site. In other words, how would you define the target of an inline function? Problem is with checking what functions are safe to inline across function flags change. It does make sense to inline plain C function, but for example you can't inline functions using SSE intrincinsc into function that has SSE disabled, which will eventually lead to a compiler crash. Validating all the possible combinations of options is very hard given the number of ISA extensions GCC supports for x86. We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that Redhat already fixed similar issues in their distro. So perhaps we only want to bring patch over from there? Honza Created attachment 769764 [details] Fix build on i686 I found the patch from fedora. The description says: > The bug here appears to be that emit_vertex() is declared 'sse2' but > vertex_emit_2s is merely always_inline. gcc8 decides that since you said > always_inline you need to have explicitly cloned it for every > permutation of targets. Merely saying inline seems to do the job of > cloning vertex_emit_2s as much as necessary. > > So to reiterate: if you say always-inline, it won't, but if you just say > maybe inline, it will. Thanks gcc, that's helpful. > > - ajax I will submit it to our xf86-video-intel, to fix the problem for now. But it still looks to me like something is not working properly in GCC. (In reply to Michal Srb from comment #6) >[...] > > So to reiterate: if you say always-inline, it won't, but if you just say > > maybe inline, it will. Thanks gcc, that's helpful. Has anyone verified that the functions are indeed inlined in this case? I mean, if they are converted to regular functions (which gcc is definitely allowed to do, because "inline" is merely a hint), then we should also verify the effects on performance... (In reply to Jan Hubicka from comment #5) > Problem is with checking what functions are safe to inline across function > flags change. It does make sense to inline plain C function, but for > example you can't inline functions using SSE intrincinsc into function that > has SSE disabled, which will eventually lead to a compiler crash. Validating > all the possible combinations of options is very hard given the number of > ISA extensions GCC supports for x86. I think we can allow inlining a function with a strict ISA subset of the caller. The reasoning would be that the caller may be only entered if the CPU supports the ISA according to the target attribute. It might be tempting to aggressively IPA propagate ISA flags from caller to callees even if we do not see all callers (broadest ISA flags of callers win). But then maybe we'll run into situations where the call from say AVX to SSE is never executed at runtime. So for ISA flags I think we can "easily" make the x86 backend allow the inlining in question as it should be a matter of bitwise AND-ing the corresponding ISA flags? > We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that > Redhat already fixed similar issues in their distro. So perhaps we only want > to bring patch over from there? > > Honza (In reply to Petr Tesařík from comment #7) > Has anyone verified that the functions are indeed inlined in this case? I checked it now and not all of the functions are inlined. For example the vertex_emit_2s is not inlined into emit_vertex. But the emit_vertex itself is inlined into all its call sites. The xf86-video-intel driver is not really well loved - to say the least. It is still shipped by distros as it is still faster in certain situations than the alternative solution. Knowing Ajax, I'd expect he wouldn't have settled with this solution easily if it wasn't this particular driver. A possible performance penalty may not even be seen as an all too unwelcome side effect ... (In reply to Richard Biener from comment #8) > (In reply to Jan Hubicka from comment #5) > > Problem is with checking what functions are safe to inline across function > > flags change. It does make sense to inline plain C function, but for > > example you can't inline functions using SSE intrincinsc into function that > > has SSE disabled, which will eventually lead to a compiler crash. Validating > > all the possible combinations of options is very hard given the number of > > ISA extensions GCC supports for x86. > > I think we can allow inlining a function with a strict ISA subset of the > caller. The reasoning would be that the caller may be only entered if > the CPU supports the ISA according to the target attribute. Looks like this is already the case but the issue here is the caller is using fpmath=sse but the callee is not. Unless the code uses -fexcess-precision=standard that isn't semantically equivalent. We're trying to allow some cases via else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath /* If the calle doesn't use FP expressions differences in ix86_fpmath can be ignored. We are called from FEs for multi-versioning call optimization, so beware of ipa_fn_summaries not available. */ && (! ipa_fn_summaries || ipa_fn_summaries->get (cgraph_node::get (callee))->fp_expressions)) ret = false; but I guess the callee does use FP here. I wonder whether the package uses any of the -ffast-math options and thus if we could relax this condition with -funsafe-math-optimizations or so. > It might be tempting to aggressively IPA propagate ISA flags from caller > to callees even if we do not see all callers (broadest ISA flags of callers > win). But then maybe we'll run into situations where the call from say > AVX to SSE is never executed at runtime. > > So for ISA flags I think we can "easily" make the x86 backend allow the > inlining in question as it should be a matter of bitwise AND-ing the > corresponding ISA flags? > > > We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that > > Redhat already fixed similar issues in their distro. So perhaps we only want > > to bring patch over from there? > > > > Honza (In reply to Richard Biener from comment #3) > The new situation is that it considers functions without > target attribute to be decorated with the effective target options of the > compilation. This would explain the situation, but the gcc invocation does not have any target options. I tried to remove all unnecessary options and the error is still happening with gcc invocation looking like this (removed -D, -I and -W options for brevity): gcc -fno-strict-aliasing -fvisibility=hidden -O2 -g -c gen4_vertex.c -fPIC -o .libs/gen4_vertex.o (In reply to Richard Biener from comment #11) > I wonder whether the package > uses any of the -ffast-math options and thus if we could relax this > condition with -funsafe-math-optimizations or so. As far as I can see it does not use -ffast-math (as parameter or attribute). I tried adding either -fexcess-precision=standard or -funsafe-math-optimizations, but it still reports the same error. (In reply to Michal Srb from comment #12) > (In reply to Richard Biener from comment #3) > > The new situation is that it considers functions without > > target attribute to be decorated with the effective target options of the > > compilation. > > This would explain the situation, but the gcc invocation does not have any > target options. I tried to remove all unnecessary options and the error is > still happening with gcc invocation looking like this (removed -D, -I and -W > options for brevity): > > gcc -fno-strict-aliasing -fvisibility=hidden -O2 -g -c gen4_vertex.c > -fPIC -o > .libs/gen4_vertex.o > > (In reply to Richard Biener from comment #11) > > I wonder whether the package > > uses any of the -ffast-math options and thus if we could relax this > > condition with -funsafe-math-optimizations or so. > > As far as I can see it does not use -ffast-math (as parameter or attribute). > I tried adding either -fexcess-precision=standard or > -funsafe-math-optimizations, but it still reports the same error. The issue is the fpmath=sse in the target attribute and the default of fpmath=387 on the command-line (yes, default settings also count). That's a semantic difference (FP computation results differ due to precision). (In reply to Richard Biener from comment #13) > The issue is the fpmath=sse in the target attribute and the default of > fpmath=387 on the command-line (yes, default settings also count). That's > a semantic difference (FP computation results differ due to precision). Ah... so, even the lowest common denominator is not universally compatible. :-/ Anyway, are you suggesting that explicitly compiling this file with fpmath=sse would be a possible solution? (In reply to Petr Tesařík from comment #14) > Anyway, are you suggesting that explicitly compiling this file with > fpmath=sse would be a possible solution? I mean, with "-mfpmath=sse" on the command line, of course. (In reply to Petr Tesařík from comment #15) > (In reply to Petr Tesařík from comment #14) > > Anyway, are you suggesting that explicitly compiling this file with > > fpmath=sse would be a possible solution? > > I mean, with "-mfpmath=sse" on the command line, of course. No, you'll get cc1: warning: SSE instruction set disabled, using 387 arithmetics and the option will then have no effect (it sets -mfpmath=387 again). Any update about this? Is there a SR to Factory? (In reply to Martin Liška from comment #17) > Any update about this? Is there a SR to Factory? It seems that we won't find any better solution, so lets go with the Ajax's workaround and remove the always_inline. I have submitted it now to X11:XOrg devel project, it will go to Factory if accepted: https://build.opensuse.org/request/show/611553 I have limited the patch only to x86, as that is the only architecture that needs it. So there should be no negative effect on x86_64. This way majority (if not all) users won't be affected and we get the package building again. (In reply to Michal Srb from comment #18) > (In reply to Martin Liška from comment #17) > > Any update about this? Is there a SR to Factory? > > It seems that we won't find any better solution, so lets go with the Ajax's > workaround and remove the always_inline. > > I have submitted it now to X11:XOrg devel project, it will go to Factory if > accepted: > https://build.opensuse.org/request/show/611553 > > I have limited the patch only to x86, as that is the only architecture that > needs it. So there should be no negative effect on x86_64. This way majority > (if not all) users won't be affected and we get the package building again. Thanks for that! Forwarded submission to Factory: https://build.opensuse.org/request/show/611659 Closing the bug. |