Bug 1003085

Summary: rabbitmq-server fails to start when epmd is started from service
Product: [openSUSE] openSUSE Distribution
Reporter: Theo Chatzimichos <tchatzimichos>
Component: Other
Assignee: Dirk Mueller <dmueller>
Status: RESOLVED WONTFIX
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: mchandras, mrueckert, ralf
Version: Leap 42.1
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Theo Chatzimichos 2016-10-05 13:12:38 UTC
The following issue affects the latest maintenance update of rabbitmq-server for Leap 42.1, version 3.5.1-4.1. The previous version, 3.5.1-2.7, works fine.

When starting epmd via service, rabbitmq-server fails to start. It works if we don't start epmd from the service, but instead let rabbitmq start it. A few command outputs below:

## working scenario

odb:~ # ps aux | grep epmd | grep -v grep
odb:~ # rcrabbitmq-server start
odb:~ # rcrabbitmq-server status
rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled)
   Active: active (running) since Wed 2016-10-05 13:02:44 UTC; 2s ago
  Process: 5615 ExecStop=/usr/sbin/rabbitmqctl stop (code=exited, status=0/SUCCESS)
 Main PID: 5750 (epmd)
   Status: "Exited."
   CGroup: /system.slice/rabbitmq-server.service
           ├─5731 /bin/sh /usr/sbin/rabbitmq-server
           ├─5735 /usr/lib64/erlang/erts-7.0.3/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib64/rabbitmq/lib/rabbitmq_server-3.5.1/sbin/../ebin -noshell -noinput -s rabbit boot -sname rabbit@localhost -boot start_sasl -config /etc/rabbitmq/rabbitmq -kernel inet_default_connect_options [{nodelay,true}] -rabbit tcp_listeners [{"auto",5672}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/var/log/rabbitmq/rabbit@localhost.log"} -rabbit sasl_error_logger {file,"/var/log/rabbitmq/rabbit@localhost-sasl.log"} -rabbit enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/lib64/rabbitmq/lib/rabbitmq_server-3.5.1/sbin/../plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/rabbit@localhost-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/rabbit@localhost" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672
           └─5750 /usr/lib64/erlang/erts-7.0.3/bin/epmd -daemon

Oct 05 13:02:44 odb systemd[1]: Starting RabbitMQ broker...
Oct 05 13:02:44 odb systemd[1]: Started RabbitMQ broker.
odb:~ # ps aux | grep epmd | grep -v grep
rabbitmq  5750  0.0  0.0  25228   208 ?        S    13:02   0:00 /usr/lib64/erlang/erts-7.0.3/bin/epmd -daemon

## non-working scenario
odb:~ # ps aux | grep epmd | grep -v grep
odb:~ # rcepmd start
odb:~ # rcepmd status
epmd.service - Erlang Port Mapper Daemon
   Loaded: loaded (/usr/lib/systemd/system/epmd.service; enabled)
   Active: active (running) since Wed 2016-10-05 13:05:14 UTC; 1s ago
 Main PID: 5992 (epmd)
   CGroup: /system.slice/epmd.service
           └─5992 /usr/bin/epmd -systemd

Oct 05 13:05:14 odb systemd[1]: Starting Erlang Port Mapper Daemon...
Oct 05 13:05:14 odb systemd[1]: Started Erlang Port Mapper Daemon.
odb:~ # ps aux | grep epmd | grep -v grep
epmd      5992  0.0  0.0  25228  1480 ?        Ss   13:05   0:00 /usr/bin/epmd -systemd
odb:~ # rcrabbitmq-server start
Job for rabbitmq-server.service failed. See "systemctl status rabbitmq-server.service" and "journalctl -xn" for details.

journalctl reports:
Oct 05 13:07:18 odb systemd[1]: Starting RabbitMQ broker...
Oct 05 13:07:18 odb systemd[1]: Cannot find unit for notify message of PID 6294.
Oct 05 13:08:48 odb systemd[1]: rabbitmq-server.service start operation timed out. Terminating.
Oct 05 13:08:48 odb systemd[1]: Failed to start RabbitMQ broker.
Oct 05 13:08:48 odb systemd[1]: Unit rabbitmq-server.service entered failed state.
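
The two scenarios above differ only in how epmd was started: `epmd -daemon` spawned by rabbitmq-server in the working case, `epmd -systemd` from epmd.service in the failing one. A minimal sketch for telling the two apart before starting the broker. The function name `epmd_mode` is hypothetical, and it only inspects command lines like the ones in the `ps` output above.

```shell
#!/bin/sh
# Hypothetical helper: classify how epmd was started.
# Reads process command lines on stdin, one per line.
# Prints one of: systemd, daemon, other, none.
epmd_mode() {
    mode=none
    while IFS= read -r line; do
        case "$line" in
            *"epmd -systemd"*) mode=systemd ;;   # started by epmd.service
            *"epmd -daemon"*)  mode=daemon ;;    # spawned by an Erlang node
            *epmd*)
                # Some other epmd invocation; keep an earlier, more specific result.
                if [ "$mode" = none ]; then mode=other; fi
                ;;
        esac
    done
    echo "$mode"
}
```

It could be fed with something like `ps -eo args= | grep '[e]pmd' | epmd_mode`. A result of `systemd` corresponds to the non-working scenario in this report.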
Comment 1 Markos Chandras 2016-10-05 13:34:57 UTC
This also likely affects Tumbleweed, Leap 42.2 and SLE 12 SP2 because all of them share the same service file.

For reference the upstream service file recommendation is this:

https://github.com/rabbitmq/rabbitmq-server/blob/90a1e737dec654dacb284c8245c163d846070b34/docs/rabbitmq-server.service.example

but we can't use it because the epmd.socket/service shipped in our erlang-epmd package always listens on 127.0.0.1. The upstream file uses an epmd@ template so that epmd listens on all addresses.

Previously, rabbitmq-server could only run on 127.0.0.1.

The current situation is not ideal, but it currently allows you to run rabbitmq-server on whatever IP you want.

So if rabbitmq-server needs fixing, it's best to fix epmd at the same time.
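
Since the comment above hinges on which address epmd is bound to, here is a small sketch for checking it. It assumes epmd uses its default port 4369 and that `ss` from iproute2 is available. The function name `epmd_listen_addrs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the local address:port pairs that are
# listening on the epmd port. Expects `ss -tln` output on stdin.
# $1: port to look for (assumed default: 4369, epmd's standard port).
epmd_listen_addrs() {
    port="${1:-4369}"
    # In `ss -tln` output, field 1 is the state and field 4 is the
    # local address:port. Match only an exact ":<port>" suffix.
    awk -v p=":$port" '
        $1 == "LISTEN" && substr($4, length($4) - length(p) + 1) == p { print $4 }
    '
}
```

Run as `ss -tln | epmd_listen_addrs`. If the only line printed starts with 127.0.0.1, epmd is reachable on localhost only, which is the situation described in this comment.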
Comment 2 Dirk Mueller 2016-10-05 15:14:12 UTC
Yes, rabbitmq requires epmd to listen on any IP address.
Comment 5 Theo Chatzimichos 2016-12-14 13:41:49 UTC
the problem is not happening any more on 42.2, feel free to close the ticket
Comment 6 Markos Chandras 2017-06-06 18:09:26 UTC
(In reply to Theo Chatzimichos from comment #5)
> the problem is not happening any more on 42.2, feel free to close the ticket

Slightly late (sorry, I just started using 42.2 on the rabbitmq-server host), but I am still facing the same issue on 42.2, mainly because the socket still listens on 127.0.0.1

https://build.opensuse.org/package/view_file/devel:languages:erlang:Factory/erlang/epmd.socket?expand=1

I don't see where upstream defaults to localhost. Marcus, where is that?

I still see the recommended service file to be 

https://github.com/rabbitmq/rabbitmq-server/blob/1b0096a925ba56af16d4776713a5c8e9593c587a/docs/rabbitmq-server.service.example

why can't we have the same?
Comment 7 Marcus Rückert 2017-06-06 19:34:00 UTC
because it is much easier to do the stuff shown in our README.SUSE

https://build.opensuse.org/package/view_file/devel:languages:erlang:Factory/erlang/README.SUSE?expand=1
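
The README.SUSE text is not reproduced in this report. As a rough illustration only, one generic systemd way to change the address a socket unit listens on is a drop-in override. The sketch below assumes that approach, not the README's exact instructions. The function name, the file name `listen.conf`, and the default `0.0.0.0:4369` are hypothetical choices.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that replaces the listen
# address of epmd.socket.
# $1: systemd unit directory, e.g. /etc/systemd/system
# $2: address:port to listen on (assumed default: 0.0.0.0:4369)
write_epmd_socket_override() {
    unit_dir="$1"
    listen="${2:-0.0.0.0:4369}"
    dropin_dir="$unit_dir/epmd.socket.d"
    mkdir -p "$dropin_dir" || return 1
    # The empty ListenStream= resets the packaged value, so the new
    # address replaces it instead of being added to it.
    printf '[Socket]\nListenStream=\nListenStream=%s\n' "$listen" \
        > "$dropin_dir/listen.conf"
}
```

After something like `write_epmd_socket_override /etc/systemd/system`, the change would take effect with `systemctl daemon-reload` followed by a restart of epmd.socket. Whether this matches the workaround the participants settled on is not stated in the thread.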
Comment 8 Markos Chandras 2017-06-07 15:09:40 UTC
OK, I talked to Marcus. I can live with the proposed solution. It seems Dirk already has a workaround for openstack. I will also apply what Marcus suggested. Therefore I am closing this bug as WONTFIX.