Bug 657605

Summary: zypp+curl chunked download produces corrupt files
Product: [openSUSE] openSUSE 11.4 Reporter: Bernhard Wiedemann <bwiedemann>
Component: libzypp Assignee: Michael Schröder <mls>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: lars.vogdt, Martin.Seidler
Version: Factory   
Target Milestone: ---   
Hardware: i686   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: zypper.log extract from corrupted download

Description Bernhard Wiedemann 2010-12-04 18:52:27 UTC
zypp+curl chunked download produces corrupt files

How To Reproduce:
1. install Factory from KDE-LiveCD-i686-Build0915
2. possibly optional: add factory-tested repo
3. call yast2 -i
4. select "Accept" and "Continue"

Actual Results:
This starts to download rpms, first fetching the metalink.xml from download.o.o and then concurrently fetching chunks from different mirrors.
However, on big files, there is a certain chance that the result will be corrupted.
In one case, a diff of the corrupted file against the intact file showed that between offsets 384 KB and 512 KB the corrupt file contained the data that belongs at offset 0.
Since zypp/curl uses 128KB chunks, the chunking/reassembly is probably faulty.
Full tcpdump capture is available.
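
The suspected failure mode can be reproduced without any network. The following is only a minimal sketch, not the libzypp code: `reassemble` and `ignoring_chunks` are hypothetical names, and the simulated "server" that ignores the range and sends the file from offset 0 is an assumption based on the observation above.

```python
CHUNK = 128 * 1024  # chunk size mentioned in this report


def reassemble(original, ignoring_chunks):
    """Naively rebuild a file from one range request per chunk.

    For every chunk index in `ignoring_chunks`, the simulated server
    ignores the requested range and sends the file from offset 0.
    The client does not notice and writes the first bytes it receives
    at the offset it asked for.
    """
    out = bytearray(len(original))
    for index, start in enumerate(range(0, len(original), CHUNK)):
        length = min(CHUNK, len(original) - start)
        if index in ignoring_chunks:
            body = original[:length]               # data from offset 0
        else:
            body = original[start:start + length]  # the requested range
        out[start:start + length] = body
    return bytes(out)


if __name__ == "__main__":
    # Five chunks with distinct content; chunk 3 covers 384 KB to 512 KB.
    original = b"".join(bytes([i]) * CHUNK for i in range(5))
    broken = reassemble(original, {3})
    print(broken[3 * CHUNK:4 * CHUNK] == original[:CHUNK])
```

With chunk 3 served by a range-ignoring server, the region from 384 KB to 512 KB ends up holding a copy of the first 128 KB, which matches the pattern seen in the diff.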

Expected Results:
The download should produce the same content as wget.

Reproducible: Always

Extra Info:
One of the 14 mirrors refused to deliver content from offsets:

> curl -v -C900 http://opensuse.mirrors.proxad.net/opensuse/factory-tested/repo/oss/README
* About to connect() to opensuse.mirrors.proxad.net port 80 (#0)
*   Trying 2a01:e0c:1:1598::1... connected
* Connected to opensuse.mirrors.proxad.net (2a01:e0c:1:1598::1) port 80 (#0)
> GET /opensuse/factory-tested/repo/oss/README HTTP/1.1
> Range: bytes=900-
> User-Agent: curl/7.21.2 (i686-pc-linux-gnu) libcurl/7.21.2 OpenSSL/1.0.0 zlib/1.2.5 libidn/1.15 libssh2/1.2.7
> Host: opensuse.mirrors.proxad.net
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< ETag: "-1235578355"
< Last-Modified: Mon, 15 Nov 2010 06:28:34 GMT
< Content-Length: 1002
< Date: Sat, 04 Dec 2010 18:30:49 GMT
< Server: lighttpd/1.4.19
< 
* HTTP server doesn't seem to support byte ranges. Cannot resume.
* Closing connection #0
curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume.

The server responds with the full file, as if the request had no "Range: bytes=" header.
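
A client can tell the two cases apart from the response alone. The sketch below is a hypothetical helper (`range_honored` is not a libzypp or curl function) that treats a range request as honored only when the status is 206 and a Content-Range header is present:

```python
def range_honored(status, headers):
    """Return True only for a 206 response that carries Content-Range.

    `headers` is a plain dict of response headers; header names are
    compared case-insensitively.
    """
    names = {name.lower() for name in headers}
    return status == 206 and "content-range" in names


if __name__ == "__main__":
    # Status and headers copied from the proxad response shown above.
    proxad_headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": "1002",
        "Server": "lighttpd/1.4.19",
    }
    print(range_honored(200, proxad_headers))
```

Applied to the response above (200 OK, no Content-Range), the check reports that the range was not honored, so the body must not be treated as starting at offset 900.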

In the network capture it is also visible that identical data is requested 3-5 times from different servers, wasting a lot of network bandwidth.
Comment 1 Bernhard Wiedemann 2010-12-05 14:38:43 UTC
Created attachment 403478 [details]
zypper.log extract from corrupted download

Notice the occurrence of the proxad mirror with a non-zero starting offset.
Comment 2 Bernhard Wiedemann 2010-12-05 19:46:33 UTC
workaround: export ZYPP_MULTICURL=0
Comment 3 Bernhard Wiedemann 2010-12-06 08:21:56 UTC
zypp/media/MediaMultiCurl.cc:241 does not record whether a reply has a "Content-Range:" header.
RFC2616 only specifies this as a SHOULD, so there are servers out there that return data from offset 0 with "200 OK" instead of "206 Partial Content", and that data then gets written to the wrong offset.
3 of 4 samples of my corrupted rpm collection support this.
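
One way to guard against this is to derive the write offset from the reply instead of from the request. This is only a sketch of the idea, not the eventual libzypp fix; `write_offset` is a hypothetical name, and rejecting every reply other than a matching 206 is an assumed policy.

```python
import re

# Matches e.g. "bytes 393216-524287/655360" or "bytes 0-99/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def write_offset(status, content_range, requested_start):
    """Return the offset the body may be written at, or None to reject.

    The body is accepted only if the server answered 206 and its
    Content-Range starts exactly at the offset that was requested.
    """
    if status != 206 or content_range is None:
        return None
    match = _CONTENT_RANGE.match(content_range.strip())
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    if first != requested_start or last < first:
        return None
    return first
```

A "200 OK" reply, or a 206 whose range starts somewhere else, yields None, so the caller can drop the data and retry the chunk on another mirror.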
Comment 4 Michael Schröder 2010-12-06 11:03:42 UTC
I didn't implement a 206 check because metalinks normally contain block checksums, which allow the client to detect misbehaving servers. (I also don't know what ftp servers return.)
There seems to be something wrong with our download infrastructure, because the metalink file doesn't contain any block checksums. That's why multicurl doesn't detect this error. So I think you're right, we should check for 206.
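
For illustration, a block checksum check of the kind described here could look like the sketch below. It is not the multicurl implementation: `find_bad_blocks` is a hypothetical name, and the use of SHA-1 and of a plain list of hex digests are assumptions about what the metalink would provide.

```python
import hashlib


def find_bad_blocks(data, block_size, expected_hex):
    """Return the indices of blocks whose SHA-1 digest does not match.

    `expected_hex` holds one hex digest per block, in file order.
    """
    bad = []
    for index, expected in enumerate(expected_hex):
        block = data[index * block_size:(index + 1) * block_size]
        if hashlib.sha1(block).hexdigest() != expected.lower():
            bad.append(index)
    return bad
```

A block that was filled with data from offset 0 fails its checksum, so the client knows which block to fetch again, and from which mirror the bad data came.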
Comment 5 Bernhard Wiedemann 2010-12-18 17:03:43 UTC
This issue should be fixed with libzypp-8.10.2 as shipped with 11.4-MS5.
If you still have an older install, be sure to update libzypp first.