Bugzilla – Full Text Bug Listing
| Summary: | can not start crypto helper: failed to find any available worker | | |
|---|---|---|---|
| Product: | [openSUSE] SUSE Linux 10.1 | Reporter: | Andreas Schwab <schwab> |
| Component: | Network | Assignee: | Marius Tomaschewski <mt> |
| Status: | RESOLVED WONTFIX | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | | |
| Priority: | P5 - None | CC: | lmuelle, radmanic, suse-beta |
| Version: | Final | | |
| Target Milestone: | --- | | |
| Hardware: | Other | | |
| OS: | Other | | |
| Whiteboard: | | | |
| Found By: | Other | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | grep -F 'pluto[17401]' /var/log/messages | | |
Description

Andreas Schwab 2006-06-17 15:16:44 UTC

Created attachment 90031 [details]
grep -F 'pluto[17401]' /var/log/messages
If you have more than two CPUs (as reported by sysconf(_SC_NPROCESSORS_ONLN)),
pluto starts ncpu_online-1 crypto helpers; otherwise it starts only one helper.
BTW: you can also override this value using the "nhelpers" parameter in
the "config setup" section of /etc/ipsec.conf.

What happens is that if you have multiple tunnels (to one destination), the
reinit of the IPsec SAs is done asynchronously -- but the requests are
serialized, because no worker is available at that moment; that is, all
workers (usually only one) are busy with work for another tunnel.
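
The helper-count rule described above can be sketched as follows. This is an illustrative Python model of the behaviour as reported in this bug, not pluto's actual C source; the function name `pluto_nhelpers` is made up for the example:

```python
import os

def pluto_nhelpers(ncpu_online: int) -> int:
    """Illustrative model of the rule described above: with more than
    two online CPUs, pluto starts ncpu_online - 1 crypto helpers;
    otherwise it starts only a single helper."""
    if ncpu_online > 2:
        return ncpu_online - 1
    return 1

# os.sysconf("SC_NPROCESSORS_ONLN") exposes the same value that pluto
# reads via sysconf(_SC_NPROCESSORS_ONLN) in C.
print(pluto_nhelpers(os.sysconf("SC_NPROCESSORS_ONLN")))
```

As the comments below note, the computed default can be overridden with the "nhelpers" parameter in ipsec.conf.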
If this happens, the tunnel SA may expire, but it is marked for a reinit
on demand. As soon as there are packets for this tunnel, pluto reinits
it again.
You can see this in log lines like the following:
```
Jun 17 17:10:09 whitebox pluto[17401]:
  initiate on demand from 10.204.0.116:0 to 149.44.160.50:0 proto=0 state: fos_start because: acquire
"schwab-novell1" #26: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#21}
"schwab-novell1" #26: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
"schwab-novell1" #26: STATE_QUICK_I2: sent QI2, IPsec SA established
...
```
This is how pluto is working now... I can't change this default.
You can also set the ("keep_alive=20" and) "force_keepalive=yes"
options in the "config setup" section of /etc/ipsec.conf.
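
Putting the options mentioned in this thread together, a "config setup" section might look like the following. This is an illustrative fragment only; the option names and example values are taken from the comments in this bug, not a recommended configuration:

```
# /etc/ipsec.conf (fragment, illustrative)
config setup
        # 0 disables the crypto helpers entirely (the workaround
        # discussed in this bug); a value >= 1 forces that many helpers.
        nhelpers=0
        # NAT-T keepalive interval in seconds
        keep_alive=20
        force_keepalive=yes
```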
Then why does it _never_ happen without NAT traversal? Neither keep_alive nor force_keepalive is documented in ipsec.conf(5).

Yes, I know that they're not documented.

    nhelpers=< number of helpers >= 0, 0 disables use of helpers >
    keep_alive=< in seconds, e.g. 20 >
    force_keepalive=< yes | no >

I've tested the actual 2.4.5 -- there is no behaviour difference. The "initiate on demand" is still used (and I think it'll remain). Today I started to test the 2.4.6rc1 version... I'll submit it to our BETA dist tree later.

(In reply to comment #5)
> Yes, I know that they're not documented.

force_keepalive does not change anything.

BTW: I reported this issue a long time ago (08-26-05) in the openswan bug tracking system: http://bugs.xelerance.com/view.php?id=412 -- it is still open and assigned to mcr at xelerance.

I've updated to openswan-2.4.6 (in BETA at the moment) and built RPMs for 10.0 and 10.1 (and stable) at: http://www.suse.de/~mt/openswan/RPMs/
Now the ipsec.conf contains "nhelpers=0" by default, which should avoid this problem. Please try it out and see if it works for you. Thanks!

Fixed by the "nhelpers=0" option, which is used by default on STABLE (10.2).

Marius thinks that my patch in Bug #234042 may fix this problem, so the nhelpers=0 workaround isn't needed any more. Andreas, could you give it a try? If not, we should close the ticket again; otherwise we should mark it as a duplicate.

It doesn't help.

Is this still an open bug? Did the proposed workaround (adding "nhelpers=0") do the trick? At least the patch for the fix of Bug #234042 does not seem to affect this bug, as indicated in comment #11. Please comment.

No, as Andreas already wrote, the fix from bug 234042 does not help against this problem. The "nhelpers=0" workaround is still needed and is currently the only "official fix" as provided upstream.