Bug 1092541

Summary: GCC 8: xf86-video-intel build fails on i586
Product: [openSUSE] openSUSE Tumbleweed Reporter: Martin Liška <martin.liska>
Component: X.OrgAssignee: Michal Srb <msrb>
Status: RESOLVED FIXED QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Normal    
Priority: P4 - Low CC: jan.hubicka, martin.liska, matz, msrb, mstaudt, ptesarik, rguenther, sndirsch
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on:    
Bug Blocks: 1084649    
Attachments: Fix build on i686

Comment 1 Michal Srb 2018-05-09 12:22:23 UTC
If I read it correctly, gcc 8 does not want to inline `emit_vertex` into `vertex_emit_2s` because each use different target specific options.

Their full signatures are:

  static __attribute__((always_inline)) void
  vertex_emit_2s(struct sna *sna, int16_t x, int16_t y)

  __attribute__((target("sse2,fpmath=sse"))) __attribute__((always_inline)) 
  static void emit_vertex(/* omitted */)

Is it a new behavior that gcc 8 refuses to inline function into another if the target options differ? Or is the warning/error for failing always_inline new?
Comment 2 Max Staudt 2018-05-09 13:02:08 UTC
CC the GCC people.
Comment 3 Richard Biener 2018-05-09 13:07:47 UTC
(In reply to Michal Srb from comment #1)
> If I read it correctly, gcc 8 does not want to inline `emit_vertex` into
> `vertex_emit_2s` because each use different target specific options.
> 
> Their full signatures are:
> 
>   static __attribute__((always_inline)) void
>   vertex_emit_2s(struct sna *sna, int16_t x, int16_t y)
> 
>   __attribute__((target("sse2,fpmath=sse"))) __attribute__((always_inline)) 
>   static void emit_vertex(/* omitted */)
> 
> Is it a new behavior that gcc 8 refuses to inline function into another if
> the target options differ? Or is the warning/error for failing always_inline
> new?

The specific situation is new but GCC always refuses to inline across incompatible
target options.  The new situation is that it considers functions without
target attribute to be decorated with the effective target options of the
compilation.
Comment 4 Petr Tesařík 2018-05-10 12:27:17 UTC
(In reply to Richard Biener from comment #3)
>[...]
>  The new situation is that it considers functions without
> target attribute to be decorated with the effective target options of the
> compilation.

Such behaviour sounds reasonable for non-inline functions, but AFAICS it is not useful for inline functions, where appropriate target code is generated for each call site. In other words, how would you define the target of an inline function?
Comment 5 Jan Hubicka 2018-05-10 12:42:05 UTC
Problem is with checking what functions are safe to inline across function flags change.  It does make sense to inline plain C function, but for example you can't inline functions using SSE intrincinsc into function that has SSE disabled, which will eventually lead to a compiler crash. Validating all the possible combinations of options is very hard given the number of ISA extensions GCC supports for x86.

We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that Redhat already fixed similar issues in their distro. So perhaps we only want to bring patch over from there?

Honza
Comment 6 Michal Srb 2018-05-10 12:58:31 UTC
Created attachment 769764 [details]
Fix build on i686

I found the patch from fedora. The description says:

> The bug here appears to be that emit_vertex() is declared 'sse2' but
> vertex_emit_2s is merely always_inline. gcc8 decides that since you said
> always_inline you need to have explicitly cloned it for every
> permutation of targets. Merely saying inline seems to do the job of
> cloning vertex_emit_2s as much as necessary.
>
> So to reiterate: if you say always-inline, it won't, but if you just say
> maybe inline, it will. Thanks gcc, that's helpful.
>
> - ajax

I will submit it to our xf86-video-intel, to fix the problem for now. But it still looks to me like something is not working properly in GCC.
Comment 7 Petr Tesařík 2018-05-10 14:39:50 UTC
(In reply to Michal Srb from comment #6)
>[...]
> > So to reiterate: if you say always-inline, it won't, but if you just say
> > maybe inline, it will. Thanks gcc, that's helpful.

Has anyone verified that the functions are indeed inlined in this case? I mean, if they are converted to regular functions (which gcc is definitely allowed to do, because "inline" is merely a hint), then we should also verify the effects on performance...
Comment 8 Richard Biener 2018-05-10 15:13:21 UTC
(In reply to Jan Hubicka from comment #5)
> Problem is with checking what functions are safe to inline across function
> flags change.  It does make sense to inline plain C function, but for
> example you can't inline functions using SSE intrincinsc into function that
> has SSE disabled, which will eventually lead to a compiler crash. Validating
> all the possible combinations of options is very hard given the number of
> ISA extensions GCC supports for x86.

I think we can allow inlining a function with a strict ISA subset of the
caller.  The reasoning would be that the caller may be only entered if
the CPU supports the ISA according to the target attribute.

It might be tempting to aggressively IPA propagate ISA flags from caller
to callees even if we do not see all callers (broadest ISA flags of callers
win).  But then maybe we'll run into situations where the call from say
AVX to SSE is never executed at runtime.

So for ISA flags I think we can "easily" make the x86 backend allow the
inlining in question as it should be a matter of bitwise AND-ing the
corresponding ISA flags?

> We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that
> Redhat already fixed similar issues in their distro. So perhaps we only want
> to bring patch over from there?
> 
> Honza
Comment 9 Michal Srb 2018-05-10 15:17:01 UTC
(In reply to Petr Tesařík from comment #7)
> Has anyone verified that the functions are indeed inlined in this case?

I checked it now and not all of the functions are inlined. For example the vertex_emit_2s is not inlined into emit_vertex. But the emit_vertex itself is inlined into all its call sites.
Comment 10 Egbert Eich 2018-05-10 17:13:10 UTC
The xf86-video-intel driver is not really well loved - to say the least. It is still shipped by distros as it is still faster in certain situations than the alternative solution.
Knowing Ajax, I'd expect he wouldn't have settled with this solution easily if it wasn't this particular driver.

A possible performance penalty may not even be seen as an all too unwelcome side effect ...
Comment 11 Richard Biener 2018-05-11 07:13:44 UTC
(In reply to Richard Biener from comment #8)
> (In reply to Jan Hubicka from comment #5)
> > Problem is with checking what functions are safe to inline across function
> > flags change.  It does make sense to inline plain C function, but for
> > example you can't inline functions using SSE intrincinsc into function that
> > has SSE disabled, which will eventually lead to a compiler crash. Validating
> > all the possible combinations of options is very hard given the number of
> > ISA extensions GCC supports for x86.
> 
> I think we can allow inlining a function with a strict ISA subset of the
> caller.  The reasoning would be that the caller may be only entered if
> the CPU supports the ISA according to the target attribute.

Looks like this is already the case but the issue here is the caller
is using fpmath=sse but the callee is not.  Unless the code uses
-fexcess-precision=standard that isn't semantically equivalent.  We're
trying to allow some cases via

  else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath
           /* If the calle doesn't use FP expressions differences in
              ix86_fpmath can be ignored.  We are called from FEs
              for multi-versioning call optimization, so beware of
              ipa_fn_summaries not available.  */
           && (! ipa_fn_summaries
               || ipa_fn_summaries->get
               (cgraph_node::get (callee))->fp_expressions))
    ret = false;

but I guess the callee does use FP here.  I wonder whether the package
uses any of the -ffast-math options and thus if we could relax this
condition with -funsafe-math-optimizations or so.

> It might be tempting to aggressively IPA propagate ISA flags from caller
> to callees even if we do not see all callers (broadest ISA flags of callers
> win).  But then maybe we'll run into situations where the call from say
> AVX to SSE is never executed at runtime.
> 
> So for ISA flags I think we can "easily" make the x86 backend allow the
> inlining in question as it should be a matter of bitwise AND-ing the
> corresponding ISA flags?
> 
> > We discussed this for a while on GCC IRC and Jakub Jelinek mentioned that
> > Redhat already fixed similar issues in their distro. So perhaps we only want
> > to bring patch over from there?
> > 
> > Honza
Comment 12 Michal Srb 2018-05-15 14:22:50 UTC
(In reply to Richard Biener from comment #3)
> The new situation is that it considers functions without
> target attribute to be decorated with the effective target options of the
> compilation.

This would explain the situation, but the gcc invocation does not have any target options. I tried to remove all unnecessary options and the error is still happening with gcc invocation looking like this (removed -D, -I and -W options for brevity):

    gcc -fno-strict-aliasing -fvisibility=hidden -O2 -g -c gen4_vertex.c -fPIC -o
    .libs/gen4_vertex.o

(In reply to Richard Biener from comment #11)
> I wonder whether the package
> uses any of the -ffast-math options and thus if we could relax this
> condition with -funsafe-math-optimizations or so.

As far as I can see it does not use -ffast-math (as parameter or attribute). I tried adding either -fexcess-precision=standard or -funsafe-math-optimizations, but it still reports the same error.
Comment 13 Richard Biener 2018-05-16 07:10:16 UTC
(In reply to Michal Srb from comment #12)
> (In reply to Richard Biener from comment #3)
> > The new situation is that it considers functions without
> > target attribute to be decorated with the effective target options of the
> > compilation.
> 
> This would explain the situation, but the gcc invocation does not have any
> target options. I tried to remove all unnecessary options and the error is
> still happening with gcc invocation looking like this (removed -D, -I and -W
> options for brevity):
> 
>     gcc -fno-strict-aliasing -fvisibility=hidden -O2 -g -c gen4_vertex.c
> -fPIC -o
>     .libs/gen4_vertex.o
> 
> (In reply to Richard Biener from comment #11)
> > I wonder whether the package
> > uses any of the -ffast-math options and thus if we could relax this
> > condition with -funsafe-math-optimizations or so.
> 
> As far as I can see it does not use -ffast-math (as parameter or attribute).
> I tried adding either -fexcess-precision=standard or
> -funsafe-math-optimizations, but it still reports the same error.

The issue is the fpmath=sse in the target attribute and the default of
fpmath=387 on the command-line (yes, default settings also count).  That's
a semantic difference (FP computation results differ due to precision).
Comment 14 Petr Tesařík 2018-05-16 08:08:33 UTC
(In reply to Richard Biener from comment #13)
> The issue is the fpmath=sse in the target attribute and the default of
> fpmath=387 on the command-line (yes, default settings also count).  That's
> a semantic difference (FP computation results differ due to precision).

Ah... so, even the lowest common denominator is not universally compatible. :-/

Anyway, are you suggesting that explicitly compiling this file with fpmath=sse would be a possible solution?
Comment 15 Petr Tesařík 2018-05-16 08:10:13 UTC
(In reply to Petr Tesařík from comment #14)
> Anyway, are you suggesting that explicitly compiling this file with
> fpmath=sse would be a possible solution?

I mean, with "-mfpmath=sse" on the command line, of course.
Comment 16 Richard Biener 2018-05-16 08:25:52 UTC
(In reply to Petr Tesařík from comment #15)
> (In reply to Petr Tesařík from comment #14)
> > Anyway, are you suggesting that explicitly compiling this file with
> > fpmath=sse would be a possible solution?
> 
> I mean, with "-mfpmath=sse" on the command line, of course.

No, you'll get

cc1: warning: SSE instruction set disabled, using 387 arithmetics

and the option will then have no effect (it sets -mfpmath=387 again).
Comment 17 Martin Liška 2018-05-23 12:39:20 UTC
Any update about this? Is there a SR to Factory?
Comment 18 Michal Srb 2018-05-23 12:54:45 UTC
(In reply to Martin Liška from comment #17)
> Any update about this? Is there a SR to Factory?

It seems that we won't find any better solution, so lets go with the Ajax's workaround and remove the always_inline.

I have submitted it now to X11:XOrg devel project, it will go to Factory if accepted:
https://build.opensuse.org/request/show/611553

I have limited the patch only to x86, as that is the only architecture that needs it. So there should be no negative effect on x86_64. This way majority (if not all) users won't be affected and we get the package building again.
Comment 19 Martin Liška 2018-05-23 13:01:21 UTC
(In reply to Michal Srb from comment #18)
> (In reply to Martin Liška from comment #17)
> > Any update about this? Is there a SR to Factory?
> 
> It seems that we won't find any better solution, so lets go with the Ajax's
> workaround and remove the always_inline.
> 
> I have submitted it now to X11:XOrg devel project, it will go to Factory if
> accepted:
> https://build.opensuse.org/request/show/611553
> 
> I have limited the patch only to x86, as that is the only architecture that
> needs it. So there should be no negative effect on x86_64. This way majority
> (if not all) users won't be affected and we get the package building again.

Thanks for that!
Comment 20 Michal Srb 2018-05-23 14:01:00 UTC
Forwarded submission to Factory:
https://build.opensuse.org/request/show/611659

Closing the bug.