Bug 1160922 (CVE-2019-18904) - VUL-0: CVE-2019-18904: rmt: Offline migrations endpoint eats up all of the CPU
Status: RESOLVED FIXED
Alias: CVE-2019-18904
Product: SUSE Security Incidents
Classification: Novell Products
Component: Audits
Version: unspecified
Hardware: Other Other
Priority: P1 - Urgent
Severity: Major
Target Milestone: ---
Assignee: Security Team bot
QA Contact: Security Team bot
URL: https://trello.com/c/yGEzRyNK
Whiteboard: CVSSv3:SUSE:CVE-2019-18904:7.5:(AV:N/...
Reported: 2020-01-14 14:29 UTC by Ivan Kapelyukhin
Modified: 2021-05-18 09:21 UTC

Attachments
migration_client.rb that reproduces the issue (3.59 KB, application/x-ruby)
2020-01-14 14:29 UTC, Ivan Kapelyukhin

Description Ivan Kapelyukhin 2020-01-14 14:29:53 UTC
Created attachment 827518: migration_client.rb that reproduces the issue

When doing an offline migration from SLES 12 SP4 to SLES 15 SP1, the request to the offline migrations endpoint times out and RMT hangs, stuck at 100% CPU load. Meanwhile, `zypper migration` fails with a timeout while waiting for the response.

It seems that this is caused by a combinatorial explosion in the number of possible migrations computed by the migration engine. The log shows thousands upon thousands of queries made to the DB.
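
To make the suspected explosion concrete, here is a minimal sketch. It is not the RMT migration engine: the product names, versions, and both helper functions are hypothetical. It only assumes that each installed product has a few candidate successor versions and that a naive engine tries every combination of them.

```ruby
# Hypothetical illustration only: this is NOT the RMT migration engine.
# Assumption: every installed product has a list of candidate successor
# versions, and a naive engine enumerates every combination of them.

# Number of combinations a naive enumeration would have to visit.
def naive_combination_count(successors_by_product)
  successors_by_product.values.map(&:size).reduce(1, :*)
end

# Materialize every combination (the Cartesian product of all candidate lists).
def naive_combinations(successors_by_product)
  lists = successors_by_product.values
  return [[]] if lists.empty?

  first, *rest = lists
  first.product(*rest)
end

# Made-up product names and versions, for illustration only.
EXAMPLE_SUCCESSORS = {
  'base-system' => %w[15 15.1],
  'module-a'    => %w[15 15.1],
  'module-b'    => %w[15 15.1],
  'module-c'    => %w[15 15.1]
}.freeze

puts naive_combination_count(EXAMPLE_SUCCESSORS)
```

The count is the product of the list sizes, so it grows multiplicatively: four products with two candidates each give 2**4 = 16 combinations, and ten products with three candidates each give 3**10 = 59049. If every combination needs at least one DB query to validate, that would be consistent with the query volume seen in the log.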

I have attached migration_client.rb configured to reproduce the issue.

This is a major issue: it not only prevents the customer from migrating, it also brings down RMT.
Comment 1 Thomas Schmidt 2020-01-14 14:33:55 UTC
I think we fixed that in SCC, and forgot to patch RMT. Worth comparing the code of the migration engine.
Comment 2 Ivan Kapelyukhin 2020-01-14 14:36:49 UTC
(In reply to Thomas Schmidt from comment #1)
> I think we fixed that in SCC, and forgot to patch RMT. Worth comparing the
> code of the migration engine.

Yeah, I've looked at the code -- and it's different, the implementations have diverged at some point.

I'm not sure if I can fix it myself -- I can't really verify if it is producing the correct migration paths or not.
Comment 3 Ivan Kapelyukhin 2020-01-15 11:01:47 UTC
We are seeing this issue in the wild:

>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 73611 _rmt      20   0 1039944 223212   8048 S 100.0 2.744  55:11.87 rails

A single offline migrations request has been made:

> rmt-azure-1-australiaeast:/var/log/nginx # cat rmt_https_access.log | grep offline
> 13.75.217.161 - SCC_9491d5b375df41fc8d486c36179e3668 [15/Jan/2020:10:04:50 +0000] "POST /connect/systems/products/offline_migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.16"
> rmt-azure-1-australiaeast:/var/log/nginx # date
> Wed Jan 15 10:52:09 UTC 2020

I'm bumping the priority on this bug.
Comment 4 Thomas Hutterer 2020-01-15 11:20:38 UTC
I made this a fastlane task in our team's Kanban: https://trello.com/c/yGEzRyNK
We'll fix it asap!
Comment 5 Ivan Kapelyukhin 2020-01-15 11:57:49 UTC
Online migrations are also affected. The clients hit their read timeouts and disconnect; nginx logs status 499 when a client closes the connection before the response has been sent:

rmt-ec2-1-eu-west-1b:

> 54.246.215.63 - SCC_ad7dfb5b99634b0db2fe34fa68d566e3 [15/Jan/2020:11:04:53 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.11"

rmt-ec2-1-us-east-2c:

> 18.219.81.203 - SCC_f5b6a28ad01448bea01671a5b729736d [15/Jan/2020:03:42:42 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.17"
> 18.219.81.203 - SCC_f5b6a28ad01448bea01671a5b729736d [15/Jan/2020:03:58:13 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.17"
> 18.219.81.203 - SCC_f5b6a28ad01448bea01671a5b729736d [15/Jan/2020:04:08:59 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.17"
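
For reference, the client side of such a 499 can be sketched as follows. This is only an illustration under assumptions: `post_with_read_timeout` is a hypothetical helper, the empty JSON body is a placeholder, and this is not how SUSEConnect itself is implemented. It shows a client that gives up once the server stays silent past the read timeout, which is the point at which nginx would record the request as closed by the client.

```ruby
require 'net/http'

# Hypothetical helper, not SUSEConnect code: POST to an endpoint and report
# whether the server answered within the read timeout.
# Returns the HTTP status code as an Integer, or :read_timeout if the server
# did not send a response in time.
def post_with_read_timeout(host, port, path, timeout_seconds, use_ssl: false)
  # Passing nil as the third argument disables any proxy taken from ENV.
  http = Net::HTTP.new(host, port, nil)
  http.use_ssl = use_ssl
  http.read_timeout = timeout_seconds
  http.max_retries = 0 # never resend the request automatically

  # Placeholder body; the real payload of the migrations request is not shown here.
  response = http.post(path, '{}', 'Content-Type' => 'application/json')
  response.code.to_i
rescue Net::ReadTimeout
  # The client gives up and the connection is closed while the server is
  # still working on the request.
  :read_timeout
end
```

A call such as `post_with_read_timeout('rmt.example.com', 443, '/connect/systems/products/migrations', 60, use_ssl: true)` (hostname made up) would return `:read_timeout` if the server stays busy for longer than 60 seconds.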



Occasionally a request does succeed, but I guess that depends on which products are activated and which ones appear in the migration paths:

rmt-azure-3-westeurope-3:

> 51.144.187.210 - SCC_64c63ea85a66499ba12dc4d6d738c6f6 [15/Jan/2020:06:26:36 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 200 4830 "-" "SUSEConnect/0.3.22"
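
To get an overview across servers, the access logs can be tallied per endpoint and status code. The following is a rough sketch, not an existing RMT tool: `LOG_PATTERN` and `migration_status_counts` are names made up here, and the regular expression only assumes the log layout visible in the lines quoted in this bug.

```ruby
# Matches the request line and status code of an nginx access-log entry in
# the layout quoted above. Hypothetical pattern, not taken from the RMT config.
LOG_PATTERN = %r{"(?<method>[A-Z]+) (?<path>\S+) HTTP/[\d.]+" (?<status>\d{3}) }

# Count responses per migration endpoint and HTTP status code.
# Returns a hash like { path => { status => count } }.
def migration_status_counts(lines)
  counts = Hash.new { |hash, path| hash[path] = Hash.new(0) }
  lines.each do |line|
    match = LOG_PATTERN.match(line)
    # Covers both .../migrations and .../offline_migrations.
    next unless match && match[:path].end_with?('migrations')

    counts[match[:path]][match[:status].to_i] += 1
  end
  counts
end

# Three of the log lines quoted in this bug, used as sample input.
SAMPLE_LINES = [
  '13.75.217.161 - SCC_9491d5b375df41fc8d486c36179e3668 [15/Jan/2020:10:04:50 +0000] "POST /connect/systems/products/offline_migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.16"',
  '54.246.215.63 - SCC_ad7dfb5b99634b0db2fe34fa68d566e3 [15/Jan/2020:11:04:53 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 499 0 "-" "SUSEConnect/0.3.11"',
  '51.144.187.210 - SCC_64c63ea85a66499ba12dc4d6d738c6f6 [15/Jan/2020:06:26:36 +0000] "POST /connect/systems/products/migrations HTTP/1.1" 200 4830 "-" "SUSEConnect/0.3.22"'
].freeze

migration_status_counts(SAMPLE_LINES).each do |path, by_status|
  puts "#{path}: #{by_status}"
end
```

On a real server the input would come from the log file instead, for example `migration_status_counts(File.foreach('rmt_https_access.log'))`. A high share of 499 entries on either endpoint points at clients giving up on slow migration requests.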
Comment 6 Marcus Meissner 2020-01-17 13:10:01 UTC
Please reference CVE-2019-18904
Comment 8 Swamp Workflow Management 2020-01-30 11:18:04 UTC
SUSE-SU-2020:0260-1: An update that solves one vulnerability and has three fixes is now available.

Category: security (important)
Bug References: 1141122,1157119,1160673,1160922
CVE References: CVE-2019-18904
Sources used:
SUSE Linux Enterprise Server for SAP 15 (src):    rmt-server-2.5.2-3.26.1
SUSE Linux Enterprise Server 15-LTSS (src):    rmt-server-2.5.2-3.26.1
SUSE Linux Enterprise Module for Server Applications 15 (src):    rmt-server-2.5.2-3.26.1
SUSE Linux Enterprise High Performance Computing 15-LTSS (src):    rmt-server-2.5.2-3.26.1
SUSE Linux Enterprise High Performance Computing 15-ESPOS (src):    rmt-server-2.5.2-3.26.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 9 Swamp Workflow Management 2020-01-31 14:13:01 UTC
SUSE-SU-2020:0278-1: An update that solves one vulnerability and has three fixes is now available.

Category: security (important)
Bug References: 1141122,1157119,1160673,1160922
CVE References: CVE-2019-18904
Sources used:
SUSE Linux Enterprise Module for Server Applications 15-SP1 (src):    rmt-server-2.5.2-3.9.1
SUSE Linux Enterprise Module for Public Cloud 15-SP1 (src):    rmt-server-2.5.2-3.9.1

Comment 10 Jens Mammen 2020-02-04 13:16:29 UTC
Maintenance update (2.5.2) is released.
Comment 12 Swamp Workflow Management 2020-02-18 09:40:50 UTC
This is an autogenerated message for OBS integration:
This bug (1160922) was mentioned in
https://build.opensuse.org/request/show/775075 Factory / rmt-server
Comment 13 Swamp Workflow Management 2020-02-19 23:11:27 UTC
openSUSE-SU-2020:0235-1: An update that solves one vulnerability and has three fixes is now available.

Category: security (important)
Bug References: 1141122,1157119,1160673,1160922
CVE References: CVE-2019-18904
Sources used:
openSUSE Leap 15.1 (src):    rmt-server-2.5.2-lp151.2.9.1
Comment 14 Johannes Segitz 2020-04-03 06:52:57 UTC
Making this bug public so it can be used as a reference for the CVE.
Comment 16 Swamp Workflow Management 2020-05-05 13:38:20 UTC
SUSE-SU-2020:1179-1: An update that solves one vulnerability and has four fixes is now available.

Category: security (moderate)
Bug References: 1136020,1160922,1162296,1165548,1168554
CVE References: CVE-2019-18904
Sources used:
SUSE Linux Enterprise Server for SAP 15 (src):    rmt-server-2.5.7-3.31.1
SUSE Linux Enterprise Server 15-LTSS (src):    rmt-server-2.5.7-3.31.1
SUSE Linux Enterprise High Performance Computing 15-LTSS (src):    rmt-server-2.5.7-3.31.1
SUSE Linux Enterprise High Performance Computing 15-ESPOS (src):    rmt-server-2.5.7-3.31.1

Comment 17 OBSbugzilla Bot 2021-05-18 09:21:25 UTC
This is an autogenerated message for OBS integration:
This bug (1160922) was mentioned in
https://build.opensuse.org/request/show/893979 Factory / rmt-server