BIND 10 #2942: b10-auth terminated with signal 6, Aborted.
BIND 10 Development
do-not-reply at isc.org
Mon May 6 21:16:30 UTC 2013
#2942: b10-auth terminated with signal 6, Aborted.
-------------------------------------+-------------------------------------
Reporter: jreed | Owner:
Type: defect | Status: new
Priority: very high | Milestone: Next-Sprint-Proposed
Component: b10-auth | Resolution:
Keywords: | CVSS Scoring:
Sensitive: 0 | Defect Severity: Very High
Sub-Project: DNS | Feature Depending on Ticket:
Estimated Difficulty: 0 | Add Hours to Ticket: 0
Total Hours: 0 | Internal?: 0
-------------------------------------+-------------------------------------
Comment (by jinmei):
This is a use-after-free problem.
According to the backtrace, what happened appears to be:
- b10-auth first sets listen_on at the time of creating
`ModuleCCSession`. It eventually creates corresponding
`SyncUDPServer` (and other `DNSServer`) objects, which register
themselves with read event callbacks for the ASIO io_service.
- In the case of `SyncUDPServer`, async_receive_from() is called,
which first checks whether there is any readable data using
non-blocking I/O and, if there is, posts an event to the main loop of
the io_service (at this point such events can no longer be canceled).
As the production AS112 server is very busy, I guess the `SyncUDPServer`
effectively starts receiving queries at this point.
- b10-auth then calls configureAuthServer() to install all user
configurations. For listen_on, this means the previously created
`SyncUDPServer` (and other `DNSServer`) objects are destroyed:
{{{#!cpp
void
DNSService::clearServers() {
    BOOST_FOREACH(const DNSServiceImpl::DNSServerPtr& s, impl_->servers_) {
        s->stop();
    }
    impl_->servers_.clear();
}
}}}
- SyncUDPServer::stop() closes the socket, but at this point any
already posted completed read event isn't affected. It will
still trigger the read callback eventually.
- the `SyncUDPServer` object itself is destroyed at this point.
- then, to complete configureAuthServer(), b10-auth communicates with
other modules via the CC session, which causes calls to
io_service::run_one(). One of these calls results in a call to the
`SyncUDPServer` callback (this is actually happening according to
the backtrace).
- but at this point `this` server object has already been destroyed,
so any behavior that relies on the local member variables of the
object is undefined. That should be the reason we saw "calling pure
virtual" or other strange crashes.
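The sequence above can be replayed with a toy model. This is only a
sketch: `ToyLoop`, `ToyServer` and `replaySequence()` are hypothetical
names invented here, not BIND 10 or ASIO code, and the queued handler
holds a `std::weak_ptr` so that it can report that the server is gone
instead of dereferencing a dangling `this` as the real callback does.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the io_service event queue: a handler that
// has been posted stays queued until run_one() executes it.
class ToyLoop {
public:
    void post(std::function<void()> handler) {
        queue_.push_back(std::move(handler));
    }
    // Runs at most one queued handler; returns true if one was run.
    bool run_one() {
        if (queue_.empty()) {
            return false;
        }
        std::function<void()> handler = std::move(queue_.front());
        queue_.pop_front();
        handler();
        return true;
    }
private:
    std::deque<std::function<void()>> queue_;
};

// Hypothetical stand-in for a DNS server object.
struct ToyServer {
    bool stopped = false;
    // Like closing the socket: this does not touch the event queue.
    void stop() { stopped = true; }
};

// Replays the sequence and reports what the late handler observed.
std::string replaySequence() {
    ToyLoop loop;
    std::string observed = "handler never ran";

    // 1. The server is created and a completed read is posted for it.
    auto server = std::make_shared<ToyServer>();
    std::weak_ptr<ToyServer> watcher = server;
    assert(!watcher.expired());
    loop.post([watcher, &observed]() {
        // The real callback would use `this` here; the sketch only
        // checks whether the object is still alive.
        observed = watcher.expired() ? "server already destroyed"
                                     : "server still alive";
    });

    // 2. Reconfiguration: stop and destroy the server. The posted event
    //    stays in the queue.
    server->stop();
    server.reset();

    // 3. A later run_one() (for example while talking over the CC
    //    session) runs the stale handler.
    loop.run_one();
    return observed;
}
```

Calling replaySequence() in this model returns "server already
destroyed", which corresponds to the point where the real callback
touches a destroyed object.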
Before #2903, this code should somehow have prevented catastrophic results:
{{{#!cpp
if (checkin_callback_ != NULL) {
    (*checkin_callback_)(message);
    if (stopped_) {
        return;
    }
}
}}}
While it still uses member variables of the destroyed object and
should cause a crash or other trouble, we were probably lucky and both
`checkin_callback_` and `stopped_` retained their original values.
So, simply reverting #2903 wouldn't be a real solution. What we
should do is to control the lifetime of the server objects so they are
not destroyed until all posted events are completed.
But for the 1.1.0-release (beta), I suggest applying a workaround that
is not really correct but is just as bad as pre-#2903
(http://bind10.isc.org/raw-attachment/ticket/2942/udp-server3.diff),
and developing a complete fix (not only for `SyncUDPServer` but also
for other DNS server classes) separately.
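As an illustration of the lifetime control mentioned above, one common
approach is to let each posted handler hold a shared pointer to the
server, so that the object cannot be destroyed while an event for it is
pending. The sketch below assumes this approach only for illustration;
it is not necessarily the fix that will be adopted. `MiniLoop`,
`KeptAliveServer`, `Outcome` and `reconfigureWithPendingEvent()` are
hypothetical names, not BIND 10 or ASIO code.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the io_service event queue.
class MiniLoop {
public:
    void post(std::function<void()> handler) {
        queue_.push_back(std::move(handler));
    }
    // Runs at most one queued handler; returns true if one was run.
    bool run_one() {
        if (queue_.empty()) {
            return false;
        }
        std::function<void()> handler = std::move(queue_.front());
        queue_.pop_front();
        assert(handler);
        handler();
        return true;
    }
private:
    std::deque<std::function<void()>> queue_;
};

// Hypothetical server whose pending events keep it alive.
class KeptAliveServer
    : public std::enable_shared_from_this<KeptAliveServer> {
public:
    explicit KeptAliveServer(int& handled) : handled_(handled) {}

    // Posts a "read completed" event. The handler owns a shared_ptr to
    // the server, so the server outlives the event.
    void scheduleRead(MiniLoop& loop) {
        std::shared_ptr<KeptAliveServer> self = shared_from_this();
        loop.post([self]() { self->onRead(); });
    }
    void stop() { stopped_ = true; }

private:
    void onRead() {
        if (stopped_) {
            return;  // safe: the object is guaranteed to be alive here
        }
        ++handled_;
    }
    bool stopped_ = false;
    int& handled_;  // counts reads processed while not stopped
};

struct Outcome {
    bool aliveWhilePending;    // server alive while its event was queued
    bool destroyedAfterDrain;  // server gone once the event had run
    int handled;               // reads processed after stop()
};

Outcome reconfigureWithPendingEvent() {
    MiniLoop loop;
    int handled = 0;
    auto server = std::make_shared<KeptAliveServer>(handled);
    std::weak_ptr<KeptAliveServer> watcher = server;

    server->scheduleRead(loop);
    server->stop();
    server.reset();  // the owner drops its reference

    Outcome out;
    out.aliveWhilePending = !watcher.expired();
    loop.run_one();  // the late handler runs on a live, stopped object
    out.destroyedAfterDrain = watcher.expired();
    out.handled = handled;
    return out;
}
```

In this model the server stays alive until its pending event has run,
the `stopped_` check is well defined, and the object is released right
after the last handler completes.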
--
Ticket URL: <http://bind10.isc.org/ticket/2942#comment:18>
BIND 10 Development <http://bind10.isc.org>