Bug 896632

Summary: xosview +coretemp hangs
Product: [openSUSE] openSUSE 13.1
Reporter: Martin Schröder <martin>
Component: X11 Applications
Assignee: Dr. Werner Fink <werner>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: bwiedemann, chcao, coolo, jdelvare, martin, stefan.fent, werner
Version: Final
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 13.1
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
    dmesg
    Yet an other xosview binary with debug code in

Description Martin Schröder 2014-09-14 17:35:29 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0

On my system with four cores, xosview +coretemp hangs: the program runs, but shows no display.
Starting it with -coretemp works.

strace shows this:

stat("/sys/devices/platform/coretemp.0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
openat(AT_FDCWD, "/sys/devices/platform/coretemp.0", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
getdents(5, /* 19 entries */, 32768)    = 600
getdents(5, /* 0 entries */, 32768)     = 0
close(5)                                = 0
open("/sys/devices/platform/coretemp.0/temp2_label", O_RDONLY) = 5
read(5, "Core 0\n", 8191)               = 7
close(5)                                = 0
open("/sys/devices/platform/coretemp.0/temp4_label", O_RDONLY) = 5
read(5, "Core 2\n", 8191)               = 7
close(5)                                = 0
openat(AT_FDCWD, "/sys/class/hwmon", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
getdents(5, /* 4 entries */, 32768)     = 112
open("/sys/class/hwmon/hwmon0/device/name", O_RDONLY) = 6
read(6, "coretemp\n", 8191)             = 9
close(6)                                = 0
open("/sys/devices/platform/coretemp.0/temp0_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp1_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp2_label", O_RDONLY) = 6
read(6, "Core 0\n", 8191)               = 7
close(6)                                = 0
open("/sys/devices/platform/coretemp.0/temp3_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp4_label", O_RDONLY) = 6
read(6, "Core 2\n", 8191)               = 7
close(6)                                = 0
open("/sys/devices/platform/coretemp.0/temp5_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp6_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp7_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp8_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp9_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp10_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp11_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp12_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp13_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp14_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp15_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp16_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp17_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp18_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp19_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp20_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp21_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp22_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp23_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp24_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp25_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp26_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp27_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp28_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp29_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp30_label", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/sys/devices/platform/coretemp.0/temp31_label", O_RDONLY) = -1 ENOENT (No such file or directory)

And so forth: the index N in /sys/devices/platform/coretemp.0/tempN_label keeps increasing indefinitely (or maybe until 2^32 or 2^64, but I didn't wait that long).
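
The trace above is consistent with a search that probes tempN_label for ever larger N while looking for a label that does not exist. The following is a hypothetical sketch of that failure mode, not xosview's actual code: the function name `find_label_index` and the cap of 64 are assumptions made for illustration. With the cap, the search gives up instead of running forever.

```shell
# Hypothetical illustration, not xosview's code: probe temp0_label,
# temp1_label, ... in directory $1 for the label text $2.
# Without the cap ($3, default 64) the loop would never end when the
# label is missing, e.g. "Core 1" on a system with only Core 0 and Core 2.
find_label_index() {
    dir=$1
    want=$2
    max=${3:-64}
    i=0
    while [ "$i" -lt "$max" ]; do
        file="$dir/temp${i}_label"
        if [ -r "$file" ] && [ "$(cat "$file")" = "$want" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1    # gave up: no tempN_label below the cap carries this label
}
```

For example, `find_label_index /sys/devices/platform/coretemp.0 "Core 1" || echo "no such core"` would return after 64 probes on the system described here.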

+coretemp works on two other systems (which have only two cores).

dmesg is attached.

I tested with xosview 1.16 from upstream, but the bug is still present there (and I could find no bugtracker).

Reproducible: Always

Steps to Reproduce:
1. Start xosview +coretemp
Actual Results:  
The display of xosview is not shown.

Expected Results:  
I want to see the display of xosview.
Comment 1 Martin Schröder 2014-09-14 17:36:11 UTC
Created attachment 606278 [details]
dmesg
Comment 2 Dr. Werner Fink 2014-09-15 09:58:54 UTC
Please show us the output of

    ls -l /sys/devices/platform/

and

    ls -l /sys/devices/platform/coretemp.0/

as this is what xosview also does before trying to read the information in the files below, e.g., /sys/devices/platform/coretemp.0/

Besides this, what happens if you try, e.g.,

    for x in /sys/devices/platform/coretemp.0/temp*_label ; do
        car $x
    done

also show the result of

    lscpu
Comment 3 Dr. Werner Fink 2014-09-15 10:00:52 UTC
That should be `cat', not `car', of course.
Comment 4 Martin Schröder 2014-09-15 10:56:32 UTC
(In reply to comment #2)
> show us the output of
> 
>     ls -l /sys/devices/platform/

drwxr-xr-x 3 root root    0 15. Sep 12:30 alarmtimer
drwxr-xr-x 4 root root    0 15. Sep 02:00 coretemp.0
drwxr-xr-x 4 root root    0 15. Sep 02:00 f71882fg.2560
drwxr-xr-x 4 root root    0 15. Sep 12:30 i8042
drwxr-xr-x 3 root root    0 15. Sep 12:30 microcode
drwxr-xr-x 4 root root    0 15. Sep 12:30 pcspkr
drwxr-xr-x 2 root root    0 15. Sep 12:53 power
drwxr-xr-x 4 root root    0 15. Sep 12:30 serial8250
-rw-r--r-- 1 root root 4096 15. Sep 12:53 uevent
drwxr-xr-x 4 root root    0 15. Sep 12:30 vesafb.0

> and
> 
>     ls -l /sys/devices/platform/coretemp.0/

lrwxrwxrwx 1 root root    0 15. Sep 12:30 driver -> ../../../bus/platform/drivers/coretemp
drwxr-xr-x 3 root root    0 15. Sep 02:00 hwmon
-r--r--r-- 1 root root 4096 15. Sep 12:53 modalias
-r--r--r-- 1 root root 4096 15. Sep 02:00 name
drwxr-xr-x 2 root root    0 15. Sep 12:53 power
lrwxrwxrwx 1 root root    0 15. Sep 02:00 subsystem -> ../../../bus/platform
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp2_crit
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp2_crit_alarm
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp2_input
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp2_label
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp2_max
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp4_crit
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp4_crit_alarm
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp4_input
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp4_label
-r--r--r-- 1 root root 4096 15. Sep 02:00 temp4_max
-rw-r--r-- 1 root root 4096 15. Sep 12:30 uevent

 
> as this is what xosview also does before trying to read the information in the
> files below, e.g., /sys/devices/platform/coretemp.0/
> 
> Besides this, what happens if you try, e.g.,
> 
>     for x in /sys/devices/platform/coretemp.0/temp*_label ; do
>         car $x
>     done

Core 0
Core 2

> also show the result of
> 
>     lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 37
Model name:            Intel(R) Core(TM) i3 CPU         540  @ 3.07GHz
Stepping:              2
CPU MHz:               1200.000
BogoMIPS:              6117.98
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              4096K
NUMA node0 CPU(s):     0-3
Comment 5 Dr. Werner Fink 2014-09-15 11:40:39 UTC
Hmmmm ... in xosview, the system call openat() is used via g++ with O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC.  So where do all those sub-files come from?  When I start it here, it works.
Comment 6 Dr. Werner Fink 2014-09-15 13:48:00 UTC
Created attachment 606371 [details]
Yet an other xosview binary with debug  code in

Please save this as, e.g., xosview.test and run

        ./xosview.test +coretemp

and report the output here.
Comment 7 Martin Schröder 2014-09-15 13:55:03 UTC
(In reply to comment #6)
> and report the output here

0 <-> 2
0 <-> 2

and it still hangs.
Comment 8 Dr. Werner Fink 2014-09-15 14:24:06 UTC
The problem seems to be that the core numbering skips values:

> > 
> >     for x in /sys/devices/platform/coretemp.0/temp*_label ; do
> >         car $x
> >     done
> 
> Core 0
> Core 2

... I have never seen that, and I guess the author of xosview never has either ... where is Core 1?  Is this a change in the kernel's API?  Has this ever happened before?
Comment 9 Martin Schröder 2014-09-15 14:26:57 UTC
(In reply to comment #8)
> The problem seems to be that the core numbering skips values:
> 
> > > 
> > >     for x in /sys/devices/platform/coretemp.0/temp*_label ; do
> > >         car $x
> > >     done
> > 
> > Core 0
> > Core 2
> 
> ... I have never seen that, and I guess the author of xosview never has
> either ... where is Core 1?  Is this a change in the kernel's API?  Has this
> ever happened before?

Don't know; I've never tried coretemp before. But an i3 540 is not unusual...
Comment 10 Bernhard Wiedemann 2014-09-18 07:59:31 UTC
Btw, here is what I get on an openSUSE 12.3 system:
# for x in /sys/devices/platform/coretemp.0/temp*_label ; do cat $x ; done
Core 9
Core 10
Core 0
Core 1

on a
> processor       : 7
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 44
> model name      : Intel(R) Xeon(R) CPU           E5630  @ 2.53GHz

so I guess it is related to hyper-threading on Intel CPUs.

The xosview shipped with 12.3 ignores +coretemp, but with the one from X11:Utilities
I can reproduce the bug there.
Comment 12 Jean Delvare 2014-09-23 20:06:44 UTC
Starting with "recent" Intel platforms, core numbering is no longer linear. Applications which expected core numbering linearity may break as a consequence - I think that's what is happening here. It's not a bug in the kernel, as the numbering is provided by the platform itself.

Libsensors is already fixed so monitoring applications which rely on libsensors are also fixed. xosview authors should seriously consider moving to libsensors to gather the information they need, in order to avoid that kind of problem in the future.
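
As an illustration of reading the sensors without assuming linear numbering, here is a minimal sketch that lets the shell glob pick up whichever temp*_label files exist. It is plain sysfs access, not libsensors, and the function name `list_core_temps` is made up for this example. The default directory is the one from the listing in comment 4; on other kernels the files may live elsewhere, so treat the path as an assumption.

```shell
# Print "<label>: <degrees C>" for every temp*_label file present in a
# coretemp directory, whatever its index. The glob only matches files
# that exist, so gaps such as temp2/temp4 ("Core 0"/"Core 2") are harmless.
list_core_temps() {
    dir=${1:-/sys/devices/platform/coretemp.0}
    for label in "$dir"/temp*_label; do
        [ -r "$label" ] || continue        # also skips an unmatched glob
        input="${label%_label}_input"
        [ -r "$input" ] || continue
        millideg=$(cat "$input")           # hwmon reports millidegrees Celsius
        printf '%s: %s\n' "$(cat "$label")" "$((millideg / 1000))"
    done
    return 0
}
```

Calling `list_core_temps` without arguments reads the default directory; passing another directory makes it easy to try the function on a copy of the files.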
Comment 13 Bernhard Wiedemann 2014-10-30 14:57:54 UTC
Bug is also present in latest 1.16 from upstream.
http://www.pogo.org.uk/~mark/xosview/

However, it is hard to find a place to report this upstream,
so maybe you want to do the fix or the port to libsensors yourself?
Comment 14 Dr. Werner Fink 2014-12-18 08:10:32 UTC
Wanted: a new maintainer with a background in the /proc and /sys layout and with C++ experience, not only in programming but also in debugging.

Reason: I do not have the time, even though I use xosview myself.
Comment 15 Dr. Werner Fink 2015-02-03 13:46:12 UTC
Try latest xosview from X11:Utilities/xosview on openSUSE 13.2
Comment 16 Bernhard Wiedemann 2015-02-03 14:00:07 UTC
This is an autogenerated message for OBS integration:
This bug (896632) was mentioned in
https://build.opensuse.org/request/show/283896 Factory / xosview
Comment 17 Bernhard Wiedemann 2015-02-03 18:56:41 UTC
The new version no longer hangs on start and shows a temperature for CPU0.
Comment 18 Dr. Werner Fink 2015-02-04 07:41:26 UTC
(In reply to Bernhard Wiedemann from comment #17)

Maybe I should rename CPU0 to SOCKET0 as the manual page states:

       xosview*coretempNPackage:   0
              The number of physical CPU for meter N on Linux. Currently only
              one physical CPU can be shown per meter.

That is, this physical socket may have four cores with eight logical CPUs.
Comment 19 Martin Schröder 2015-02-04 09:31:28 UTC
(In reply to Dr. Werner Fink from comment #15)
> Try latest xosview from X11:Utilities/xosview on openSUSE 13.2

That shows me one bar (with CPU0) on my i3 540 (2 cores, 2HT cores).
Comment 20 Martin Schröder 2015-02-04 13:01:18 UTC
(In reply to Martin Schröder from comment #19)
> (In reply to Dr. Werner Fink from comment #15)
> > Try latest xosview from X11:Utilities/xosview on openSUSE 13.2
> 
> That shows me one bar (with CPU0) on my i3 540 (2 cores, 2HT cores).

Btw: sensors knows about two cores:

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +54.0°C  (high = +89.0°C, crit = +105.0°C)
Core 2:       +59.0°C  (high = +89.0°C, crit = +105.0°C)
Comment 21 Dr. Werner Fink 2015-02-04 13:16:34 UTC
(In reply to Martin Schröder from comment #20)

Did you read the manual page? ... At least the paragraph I pasted:

       xosview*coretempNPackage:   0
              The number of physical CPU for meter N on Linux. Currently only
              one physical CPU can be shown per meter.

I will not hack this away.  xosview simply reads the link `device', and if this points to coretemp.* it reads temp*_input below /sys/class/hwmon/hwmon*/.  I guess you have exactly one coretemp.0, hence one line.
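
The lookup described above can be approximated in shell. This is only a sketch of the described behaviour, not xosview's implementation: the function name `coretemp_inputs` is hypothetical, and checking both the hwmon directory and its `device' target for temp*_input files is an assumption, made because the listing in comment 4 shows the files directly under coretemp.0.

```shell
# For every hwmon class device whose `device' link resolves to a
# coretemp.* directory, print the temp*_input files found there.
# The base directory is a parameter so the function can be tried on a copy.
coretemp_inputs() {
    base=${1:-/sys/class/hwmon}
    for hw in "$base"/hwmon*; do
        [ -e "$hw/device" ] || continue
        # Resolve the symlink to the physical directory it points to.
        target=$(cd -P "$hw/device" 2>/dev/null && pwd -P) || continue
        case ${target##*/} in
            coretemp.*) ;;
            *) continue ;;
        esac
        # Assumption: inputs may sit next to hwmonN or below its device.
        for f in "$hw"/temp*_input "$hw"/device/temp*_input; do
            [ -r "$f" ] && echo "$f"
        done
    done
    return 0
}
```

On a machine with a single coretemp.0 this prints the input files of that one device only, which matches the "one coretemp.0, hence one line" explanation.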

If you think this should be changed then you are free to fork https://github.com/hills/xosview
Comment 22 Martin Schröder 2015-02-04 14:23:18 UTC
(In reply to Werner Fink from comment #21)
> (In reply to Martin Schröder from comment #20)
> 
> Did you read the manual page? ... At least the paragraph I pasted:
> 
>        xosview*coretempNPackage:   0
>               The number of physical CPU for meter N on Linux. Currently only
>               one physical CPU can be shown per meter.
> 
> I will not hack this away.  xosview simply reads the link `device', and if
> this points to coretemp.* it reads temp*_input below /sys/class/hwmon/hwmon*/.
> I guess you have exactly one coretemp.0, hence one line.

Fair enough. But then the description should be changed: It's always the first core (my machine has only one physical CPU, but it has two cores).

Also, the documentation for xosview*coretempNDisplayType makes it sound like my machine should show two meters, so that option should probably also be removed.
Comment 23 Bernhard Wiedemann 2017-11-29 03:10:46 UTC
This is an autogenerated message for OBS integration:
This bug (896632) was mentioned in
https://build.opensuse.org/request/show/546335 15.0 / xosview