Bug 672450 - sysvinit+systemd: too many repeated messages on ttyS0 console
Summary: sysvinit+systemd: too many repeated messages on ttyS0 console
Status: RESOLVED DUPLICATE of bug 642289
Alias: None
Product: openSUSE 11.4
Classification: openSUSE
Component: Basesystem (show other bugs)
Version: Factory
Hardware: All Other
: P5 - None : Major (vote)
Target Milestone: ---
Assignee: Dr. Werner Fink
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-16 09:39 UTC by Jiri Slaby
Modified: 2011-07-26 13:00 UTC (History)
5 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
screenlog (3.42 MB, text/plain)
2011-02-16 09:39 UTC, Jiri Slaby
Details
systemd-screenlog (185.87 KB, text/plain)
2011-02-16 10:05 UTC, Jiri Slaby
Details
dmesg from kernel with vfs_write intercepted (78.97 KB, text/plain)
2011-02-16 13:08 UTC, Jiri Slaby
Details
dmesg from kernel with vfs_write intercepted updated (91.41 KB, text/plain)
2011-02-16 13:32 UTC, Jiri Slaby
Details
read+write syscalls with injected code (62.58 KB, text/plain)
2011-02-17 12:45 UTC, Jiri Slaby
Details
/sbin/blogd for 11.4-x86_64 (94.60 KB, application/octet-stream)
2011-02-18 10:24 UTC, Dr. Werner Fink
Details
screenlog with updated blogd (28.67 KB, text/plain)
2011-02-21 09:08 UTC, Jiri Slaby
Details
screenlog with updated blogd (199.04 KB, text/plain)
2011-02-22 21:21 UTC, Jiri Slaby
Details
/sbin/blogd for 11.4-x86_64 (39.15 KB, application/octet-stream)
2011-03-02 13:53 UTC, Dr. Werner Fink
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jiri Slaby 2011-02-16 09:39:20 UTC
Created attachment 414366 [details]
screenlog

Systemd repeats some messages thousand times while the system is booting. Screenlog attached.

For example:
...
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
^MBoot logging started on /dev/ttyS0(/dev/console) at Wed Feb 16 10:29:08 2011
...
Comment 1 Jiri Slaby 2011-02-16 09:48:50 UTC
Ah, sorry I booted with sysvinit this time. So that's its fault.
Comment 2 Jiri Slaby 2011-02-16 10:05:04 UTC
Created attachment 414379 [details]
systemd-screenlog

(In reply to comment #1)
> Ah, sorry I booted with sysvinit this time. So that's its fault.

Hmm, not really, happens also with systemd. But not that much. Only some messages are repeated.
Comment 3 Jiri Slaby 2011-02-16 10:08:41 UTC
I don't know whose fault is this then...

Cmdline:
BOOT_IMAGE=/boot/vmlinuz-2.6.37-22-default root=UUID=dceb6665-da77-45bd-
9744-06dfaa3811ea 3 console=tty0 console=ttyS0,115200
Comment 4 Dr. Werner Fink 2011-02-16 10:21:02 UTC
Hmmm .... I'm pretty sure that the blogd does send its messages only once.
You may try to disable the blogd by adding the key word `quiet' to the
kernels command line
Comment 5 Dr. Werner Fink 2011-02-16 10:29:16 UTC
The safe_out() routine of blogd retries to write out messages if
and only if the system call write(2) does return -1 and the
errno is set to  EINTR or EAGAIN or EIO ... for EIO a reconnect
to the console is tried ... for EPIPE blogd simply exits.

The question is: does the kernel set the return value to less than
zero and errno to EINTR or  even if write(2) had written out its buffer?
Comment 6 Dr. Werner Fink 2011-02-16 10:32:10 UTC
Adding maintainer of mkinitrd to carbon copy list. Hi Michal!
Comment 7 Dr. Werner Fink 2011-02-16 10:49:18 UTC
Jiri? You may remove the files

     /lib/mkinitrd/scripts/boot-blogd.sh
     /lib/mkinitrd/scripts/boot-killblogd.sh
     /lib/mkinitrd/scripts/boot-killblogd2.sh

and change within

     /lib/mkinitrd/scripts/boot-start.s

the line

     exec < /dev/console > /dev/console 2>&1

to

     exec < /dev/console > /dev/kmsg 2>&1

to use the kernels message ring buffer instead the ring buffer
of the blogd.  OK we loose the escape sequence filter as well
as the filter for the rotator/progress sequences of fsck.

... Nevertheless the serial driver in the kernel should not fool
the usage of the system call write(2).
Comment 8 Dr. Werner Fink 2011-02-16 11:28:44 UTC
To make this active the command

   mkinitrd

should be executed otherwise the initrd will not be changed :)
Comment 9 Jiri Slaby 2011-02-16 13:08:22 UTC
Created attachment 414407 [details]
dmesg from kernel with vfs_write intercepted

(In reply to comment #7)
> Jiri? You may remove the files
> 
>      /lib/mkinitrd/scripts/boot-blogd.sh
>      /lib/mkinitrd/scripts/boot-killblogd.sh
>      /lib/mkinitrd/scripts/boot-killblogd2.sh
> 
> and change within
> 
>      /lib/mkinitrd/scripts/boot-start.s
> 
> the line
> 
>      exec < /dev/console > /dev/console 2>&1
> 
> to
> 
>      exec < /dev/console > /dev/kmsg 2>&1

Yes, this stops blogd to flood the console.

> ... Nevertheless the serial driver in the kernel should not fool
> the usage of the system call write(2).

Nope, it's blogd who cripples. With this change (write(2) interception in the kernel):
diff --git a/fs/read_write.c b/fs/read_write.c
index 5520f8a..ecae48a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -384,6 +384,14 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
                inc_syscw(current);
        }
 
+       if (!strcmp(current->comm, "blogd")) {
+               int a;
+               printk("%s: cnt=%lu ret=%ld str=", __func__, count, ret);
+               for (a = 0; a < min(count, 16LU); a++)
+                       printk(KERN_CONT "%c", buf[a] == '\n' ? 'X' : buf[a]);
+               printk(KERN_CONT "\n");
+       }
+
        return ret;
 }
 
The output looks like (i.e. blogd write the same messages several times even though the retval equals to what it wanted kernel to write...):
[    2.612154] vfs_write: cnt=78 ret=78 str=^MBoot logging st
[    2.616610] vfs_write: cnt=78 ret=78 str=^MBoot logging st
[    2.621521] vfs_write: cnt=5 ret=5 str=1095X
[    2.634171] vfs_write: cnt=118 ret=118 str=^MBoot logging st
[    2.640975] vfs_write: cnt=118 ret=118 str=^MBoot logging st
[    2.646736] vfs_write: cnt=118 ret=118 str=^MBoot logging st
Comment 10 Jiri Slaby 2011-02-16 13:32:10 UTC
Created attachment 414412 [details]
dmesg from kernel with vfs_write intercepted updated

Well, I print also fd and open file. It makes more sense:
fd=1 (/dev/console) cnt=78 ret=78 redirected_tty_write str=^MBoot logging st
fd=5 (/var/run/blogd.pid) cnt=5 ret=5 do_sync_write str=1092X
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st 
fd=7 (/dev/tty0) cnt=118 ret=118 redirected_tty_write str=^MBoot logging st
fd=4 (/dev/shm/initrd.msg) cnt=118 ret=118 do_sync_write str=^MBoot logging st
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st
fd=7 (/dev/tty0) cnt=118 ret=118 redirected_tty_write str=^MBoot logging st
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st 
fd=7 (/dev/tty0) cnt=118 ret=118 redirected_tty_write str=^MBoot logging st
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st 
fd=7 (/dev/tty0) cnt=118 ret=118 redirected_tty_write str=^MBoot logging st
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st
Comment 11 Dr. Werner Fink 2011-02-16 14:00:26 UTC
Hmmm ... I'm a bit puzzled as I do not see how this should happen.
The blogd writes out each message to the physical devices only once.
This would lead to the assumption that those messages will be
read by blogd from the new /dev/console (a tty from pty/tty pair)
again and again?
Comment 12 Dr. Werner Fink 2011-02-16 14:20:25 UTC
Which kernel version is used and does this happen with a prior version (2.6.37)?
Comment 13 Dr. Werner Fink 2011-02-16 14:29:24 UTC
I can follow the lines

fd=1 (/dev/console) cnt=78 ret=78 redirected_tty_write str=^MBoot logging st
fd=5 (/var/run/blogd.pid) cnt=5 ret=5 do_sync_write str=1092X
fd=1 (/dev/ttyS0) cnt=118 ret=118 tty_write str=^MBoot logging st 
fd=7 (/dev/tty0) cnt=118 ret=118 redirected_tty_write str=^MBoot logging st
fd=4 (/dev/shm/initrd.msg) cnt=118 ret=118 do_sync_write str=^MBoot logging st

but after this the code of blogd does not show how this should go on:

            parselog(trans, cnt);         /* Parse and make copy of the input */

            for (c = cons; c; c = c->next) {
                if (c->fd < 0)
                    continue;
                safeout(c->fd, (char*)trans, cnt, c->max_canon);
                (void)tcdrain(c->fd);     /* Write copy of input to real tty */
            }

            flushlog();

the file descriptor fd=4 is used by the thread triggered by flushlog()
this shows that this is no an endless loop, that is we have two physical
devices with file descriptors fd=1 and fd=7.

Please could you determine if the blogd rereads its own input even from
the new /dev/console?
Comment 14 Dr. Werner Fink 2011-02-16 14:31:17 UTC
In other words has there been cahnges at this area in the kernel?
Comment 15 Dr. Werner Fink 2011-02-17 12:30:00 UTC
The code can not do this ... even if the blogd function safein() would be
fooled by ioctl FIONREAD or by select(), the outer loop would require that
the log file of file descriptor 4 would be hit after fd 1 and 7 as the
character buffer trans[] would be filled with the input from /dev/console.

Adding gcc maintainer as he may now what is going on here
Comment 16 Jiri Slaby 2011-02-17 12:39:24 UTC
(In reply to comment #12)
> Which kernel version is used and does this happen with a prior version
> (2.6.37)?

This is my testing kernel where I can very easily inject code. Yes, it happens with 11.4 2.6.37 kernel.

(In reply to comment #14)
> In other words has there been cahnges at this area in the kernel?

It might be possible.
Comment 17 Jiri Slaby 2011-02-17 12:45:03 UTC
Created attachment 414691 [details]
read+write syscalls with injected code

$ grep preping.91 dmesg|sed 's/.*sys_//'
read: fd=0 (/dev/ptmx) cnt=1103 ret=1103 tty_read str=preping 91-killu
write: fd=1 (/dev/ttyS0) cnt=255 ret=255 tty_write str=preping 91-killu
write: fd=4 (/dev/shm/initrd.msg) cnt=1103 ret=1103 do_sync_write str=preping 91-killu
write: fd=7 (/dev/tty0) cnt=255 ret=255 redirected_tty_write str=preping 91-killu
read: fd=0 (/dev/ptmx) cnt=1103 ret=1103 tty_read str=preping 91-killu
write: fd=1 (/dev/ttyS0) cnt=255 ret=255 tty_write str=preping 91-killu
write: fd=4 (/dev/shm/initrd.msg) cnt=1103 ret=1103 do_sync_write str=preping 91-killu
write: fd=7 (/dev/tty0) cnt=255 ret=255 redirected_tty_write str=preping 91-killu
read: fd=0 (/dev/ptmx) cnt=1103 ret=1103 tty_read str=preping 91-killu
write: fd=1 (/dev/ttyS0) cnt=255 ret=255 tty_write str=preping 91-killu
write: fd=4 (/dev/shm/initrd.msg) cnt=1103 ret=1103 do_sync_write str=preping 91-killu
write: fd=7 (/dev/tty0) cnt=255 ret=255 redirected_tty_write str=preping 91-killu


It looks like (for example) this message is read from /dev/ptmx multiple times.
Comment 18 Dr. Werner Fink 2011-02-17 12:58:53 UTC
(In reply to comment #16)

My guess was that the ioctl FIONREAD and/or the system call select(2)
could be fooled, but then the read(2) system routine would fill the
transfer buffer with the same line and this should written out successively
in all three file descriptors 1, 7, *and* 4.

But your log shows me that only the two file descriptors 1 and 7 will be used.
And as the buffer will be filled with new informations the for-loop on the list
of consoles is not an endless loop. That together with the chaning content
of the transer buffer (beside the first line) had lead me to the assumption
that something goes wrong during compile time (gcc).

Maybe this is a problem with my used buffer

 static unsigned char trans[TRANS_BUFFER_SIZE];

and the cast used below

 const ssize_t cnt = safein(fdread, (char*)trans, sizeof(trans), noerr);

 [...]

             parselog(trans, cnt);

             for (c = cons; c; c = c->next) {
                 if (c->fd < 0)
                     continue;
                 safeout(c->fd, (char*)trans, cnt, c->max_canon);
                 (void)tcdrain(c->fd);
             }

             flushlog();

  [...]

this is the reason why I've put Michael into the carbon copy list.
Comment 19 Dr. Werner Fink 2011-02-17 13:03:46 UTC
(In reply to comment #17)

Why does it happen that read(2) can read the same string several
times from the master of the slave pty used as system console?

This is new in comparison to older kernels.
Comment 20 Dr. Werner Fink 2011-02-17 13:14:25 UTC
Hmmm ... during blogd the original console devies are switched into
raw tty mode whereas the new pty as temporary system console(s) are
switched into canonical mode.
Comment 21 Michael Matz 2011-02-17 13:25:48 UTC
(FWIW, right now I'm assuming this has nothing to do with GCC.  It seems
clear to me that the cause is that messages written to /dev/console are
sometimes reflected back to /dev/ptmx by the kernel, and that that is
the cause for blogd sometimes repeating the same message)
Comment 22 Dr. Werner Fink 2011-02-17 13:36:57 UTC
OK, I'll remove you from CC

Jiri, please tell me your archtitecture, as I'd like to attach
a blogd without switching any terminal into raw mode.
Comment 23 Jiri Slaby 2011-02-17 13:42:23 UTC
(In reply to comment #22)
> OK, I'll remove you from CC
> 
> Jiri, please tell me your archtitecture, as I'd like to attach
> a blogd without switching any terminal into raw mode.

x86_64

It's like it opens /dev/ptmx for reading messages. It writes the messages to /dev/console which is /dev/ptm0 which gets to /dev/ptmx:
tty_insert_flip_string_fixed_flag: blogd idx=0 name=ptm0 str=^MBoot logging st
sys_write: fd=1 (/dev/console) cnt=78 ret=78 redirected_tty_write str=^MBoot logging st
sys_read: fd=0 file=ffff880039d57000 (/dev/ptmx) cnt=78 ret=78 tty_read str=^MBoot logging st
sys_write: fd=1 (/dev/ttyS0) cnt=78 ret=78 tty_write str=^MBoot logging st
tty_insert_flip_string_fixed_flag: blogd idx=0 name=ptm0 str=^MBoot logging st
sys_write: fd=7 (/dev/tty0) cnt=78 ret=78 redirected_tty_write str=^MBoot logging st
sys_read: fd=0 file=ffff880039d57000 (/dev/ptmx) cnt=118 ret=118 tty_read str=sys_write: fd=4 (/dev/shm/initrd.msg) cnt=78 ret=78 do_sync_write str=XBoot logging st
preping 11-block
Comment 24 Dr. Werner Fink 2011-02-18 10:24:40 UTC
Created attachment 414956 [details]
/sbin/blogd for 11.4-x86_64

Please install this executable binary as /sbin/blogd and
run mkinitrd to become an updated binary as well in the initrd.
Thank you for testing!
Comment 25 Jiri Slaby 2011-02-21 09:08:19 UTC
Created attachment 415233 [details]
screenlog with updated blogd

(In reply to comment #24)
> Please install this executable binary as /sbin/blogd and

It doesn't spam the output. But it doesn't log anything either. There is no "Boot log started" etc. on the console...
Comment 26 Dr. Werner Fink 2011-02-21 11:07:41 UTC
(In reply to comment #25)

Hmmm ... that is strange, are you sure that you have removed the
workaround mentioned in comment #7 ... just run

     lsinitrd /boot/initrd | grep blogd
     grep -E 'quiet|fastboot' /proc/cmdline

to see if there is any blogd support therein and if so, if it is
used (that is no quiet nor fastboot should be part of the kernels
command line).
Comment 27 Jiri Slaby 2011-02-22 21:21:24 UTC
Created attachment 415610 [details]
screenlog with updated blogd

(In reply to comment #26)
> Hmmm ... that is strange, are you sure that you have removed the
> workaround mentioned in comment #7 ... just run
> 
>      lsinitrd /boot/initrd | grep blogd

Yeah, my bad. chmod a+x /sbin/blogd helped. The issue persists.

I fed the output through s/[\r\n][\r\n]*/\r/ to get rid of 600k of crap.
Comment 28 Dr. Werner Fink 2011-03-02 11:12:53 UTC
... The only visible change at tty kernel API is

 CONFIG_UNIX98_PTYS=y
 CONFIG_DEVPTS_MULTIPLE_INSTANCES=y   <<<<<<<
 CONFIG_LEGACY_PTYS=y
 CONFIG_LEGACY_PTY_COUNT=0

that is what does devpts multiple instances do?
I do not understand why blogd is able to reread its own messages
send to both /dev/tty0 and /dev/ttyS0 even if the console is
explicit set to /dev/pts/0 and blogd only reads from the corresponding
/dev/ptm0.

Does this mean that with the actual kernel blogd has to close all
file descriptors for /dev/console (0, 1, and 2) to reopen them after
the ioctl TIOCCONS??
Comment 29 Dr. Werner Fink 2011-03-02 13:53:51 UTC
Created attachment 417071 [details]
/sbin/blogd for 11.4-x86_64

Yet an other blogd, now closing old console and dup pts
to stdout and stderr.
Comment 30 Greg Kroah-Hartman 2011-03-09 17:10:57 UTC
i'm confused, what do you need from me here?
Comment 31 Dr. Werner Fink 2011-03-10 08:15:57 UTC
(In reply to comment #30)

I'd like to know what difference between the kernel of 11.3 and 11.4
causes that blogd reads its own output to the psysical devices of
a serial console from the system console even after the ioctl TIOCCONS
for redirecting console outout to a pts device.
Comment 32 Dr. Werner Fink 2011-03-10 08:19:41 UTC
I've added Hannes to carbon copy list.
Hi Hannes, you'll find in attchment #417071 an blogd binary for
x86_64 which explicit closes file descriptor 0 aka /dev/console
to open the pts device after the ioctl TIOCCONS as new system
console.  The question is: does this work around the new
(mis)behaviour of the kernel of 11.4
Comment 33 Hannes Reinecke 2011-03-10 13:51:50 UTC
Yes, the blogd from comment #29 fixes the problem for me.
Comment 34 Dr. Werner Fink 2011-03-10 14:11:50 UTC
OK ... then this change should go to 11.4
Comment 35 Dr. Werner Fink 2011-03-10 14:24:50 UTC
(In reply to comment #34)

I've submitted a fixed sysvinit package for this bug #672450 and
for bug ##664941
Comment 36 Marcus Meissner 2011-03-14 14:46:41 UTC
the sysvinit submission currentöly in 11.4 is the same as we shipped with 11.4.

(obs request 63912)

top of changes is still:
-------------------------------------------------------------------
Tue Feb 15 15:44:33 CET 2011 - werner@suse.de

- New killproc version 2.19: bug fix update

and there is no diff to the 11.4 shipping one.
Comment 37 Dr. Werner Fink 2011-03-14 15:01:19 UTC
See obs equest id 64128
Comment 38 Marcus Meissner 2011-03-14 15:31:53 UTC
seems sensible +1
Comment 39 Marcus Meissner 2011-03-14 15:36:29 UTC
ok, i suvbnmitted a patchinfo already
Comment 40 Dr. Werner Fink 2011-03-15 09:29:31 UTC
fixed
Comment 41 Dr. Werner Fink 2011-03-23 11:43:22 UTC
Does not work in virtual systems -> race condition -> kernel bug!
Comment 42 Dr. Werner Fink 2011-03-23 12:08:24 UTC
The fixed version does its work for

   console=ttyS0,38400n8 console=tty0

in /boot/grub/menu.lst but *not* for the reverse order

   console=tty0 console=ttyS0,38400n8

something wents wrong in the kernel. IMHO this is a regression
in the kernel code for serial console handling as this had
worked flawless with 2.6.34.

See also bug #642289 ... I've tried to debug this bug on a
virtual system, therefore I've detected the behaviour above.
Comment 43 Dr. Werner Fink 2011-03-24 12:35:58 UTC
Yep, there was a change in ttyname(3) which affects only in initrd.
Comment 44 Dr. Werner Fink 2011-03-25 11:24:00 UTC
dup of 642289

*** This bug has been marked as a duplicate of bug 642289 ***
Comment 45 Bernhard Wiedemann 2011-07-26 13:00:40 UTC
This is an autogenerated message for OBS integration:
This bug (672450) was mentioned in
https://build.opensuse.org/request/show/77067 Factory / sysvinit