[2/2] EM28xx - fix deadlock when unplugging and replugging a DVB adapter

Message ID 4E4F9E86.7030001@yahoo.com (mailing list archive)
State Rejected, archived

Commit Message

Chris Rankin Aug. 20, 2011, 11:46 a.m. UTC
  The final patch removes the unplug/replug deadlock by not holding the device 
mutex during dvb_init(). However, this mutex has already been locked during 
device initialisation by em28xx_usb_probe() and is not released again until all 
extensions have been initialised successfully.

The device mutex is not held during either em28xx_register_extension() or 
em28xx_unregister_extension() any more. More importantly, I don't believe it can 
safely be held by these functions because they must both - by their nature - 
acquire the device list mutex before they can iterate through the device list. 
In other words, while usb_probe() and usb_disconnect() acquire the device mutex 
followed by the device list mutex, the register/unregister_extension() functions 
would need to acquire these mutexes in the opposite order. And that sounds like 
a potential deadlock.
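
The ordering problem described above can be replayed in userspace. This is only a sketch: the two pthread mutexes below are hypothetical stand-ins for the device list mutex and one device's mutex, not the driver's real objects, and pthread_mutex_trylock() is used so that an acquisition that would block forever shows up as EBUSY instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for the driver's two mutexes. */
static pthread_mutex_t devlist_mutex = PTHREAD_MUTEX_INITIALIZER; /* device list */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;      /* one device */

/*
 * Replay the opposite-order interleaving in one thread.
 * Returns 1 if each path ends up waiting for the lock the other holds.
 */
int abba_interleaving_deadlocks(void)
{
	int probe_blocked, ext_blocked;

	pthread_mutex_lock(&dev_lock);      /* probe path: device mutex first */
	pthread_mutex_lock(&devlist_mutex); /* extension path: list mutex first */

	/* The probe path now wants the list mutex... */
	probe_blocked = (pthread_mutex_trylock(&devlist_mutex) == EBUSY);
	/* ...while the extension path wants the device mutex. */
	ext_blocked = (pthread_mutex_trylock(&dev_lock) == EBUSY);

	pthread_mutex_unlock(&devlist_mutex);
	pthread_mutex_unlock(&dev_lock);
	return probe_blocked && ext_blocked;
}

/*
 * Same two paths, but both take the list mutex first.
 * Returns 1 if the loser waits while holding nothing and the winner proceeds.
 */
int same_order_makes_progress(void)
{
	int b_waits, a_proceeds;

	pthread_mutex_lock(&devlist_mutex); /* path A: list mutex first */
	/* Path B also starts with the list mutex, so it waits empty-handed. */
	b_waits = (pthread_mutex_trylock(&devlist_mutex) == EBUSY);
	/* Path A can still take the device mutex. */
	a_proceeds = (pthread_mutex_trylock(&dev_lock) == 0);

	if (a_proceeds)
		pthread_mutex_unlock(&dev_lock);
	pthread_mutex_unlock(&devlist_mutex);
	return b_waits && a_proceeds;
}
```

With default (non-recursive) mutexes, the first function reports that both paths are stuck, while the second shows that a single agreed order always lets one path finish.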

On the other hand, the new situation is a definite improvement :-).

Signed-off-by: Chris Rankin <rankincj@yahoo.com>
  

Comments

Mauro Carvalho Chehab Aug. 20, 2011, 12:34 p.m. UTC | #1
On 20-08-2011 04:46, Chris Rankin wrote:
> The final patch removes the unplug/replug deadlock by not holding the device mutex during dvb_init(). However, this mutex has already been locked during device initialisation by em28xx_usb_probe() and is not released again until all extensions have been initialised successfully.

No. The extension load can happen after the usb probe phase. In practice,
the only case where the extension init will happen together with the
usb probe phase is when the em28xx modules are compiled built-in, as the
module load is done asynchronously, and generally happens after the em28xx
core has loaded.

> The device mutex is not held during either em28xx_register_extension() or em28xx_unregister_extension() any more. More importantly, I don't believe it can safely be held by these functions because they must both - by their nature - acquire the device list mutex before they can iterate through the device list. In other words, while usb_probe() and usb_disconnect() acquire the device mutex followed by the device list mutex, the register/unregister_extension() functions would need to acquire these mutexes in the opposite order. And that sounds like a potential deadlock.
> 
> On the other hand, the new situation is a definite improvement :-).

No, it is a regression for a known bug.

This patch can cause trouble. The point is that, after initializing the
analog part, the device can be accessed via the V4L2 API while it is
still initializing the DVB part. This actually happens in practice:
when udev detects a new device, it opens it and calls VIDIOC_QUERYCAP.

So we end up with a race, because V4L2 open and some analog-mode
ioctls call em28xx_set_mode(dev, EM28XX_ANALOG_MODE).

In other words, if the DVB initialization is still happening, the driver
should not allow any V4L2 call; otherwise the DVB detection breaks.
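
A userspace sketch of that requirement, under loud assumptions: struct fake_em28xx, the fake_* functions and the FAKE_* modes are hypothetical and only mimic the roles of dev->lock and em28xx_set_mode(). The open path here returns -EBUSY via trylock so the effect is visible in a single thread; a real driver would more likely sleep in mutex_lock() until the DVB init finishes.

```c
#include <errno.h>
#include <pthread.h>

enum fake_mode { FAKE_SUSPEND, FAKE_ANALOG_MODE, FAKE_DIGITAL_MODE };

/* Hypothetical device: one lock guarding the current mode. */
struct fake_em28xx {
	pthread_mutex_t lock;
	enum fake_mode mode;
};

void fake_dev_init(struct fake_em28xx *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->mode = FAKE_SUSPEND;
}

/* DVB init takes the device lock and switches to digital mode. */
void fake_dvb_init_begin(struct fake_em28xx *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->mode = FAKE_DIGITAL_MODE;
}

/* DVB init is done: suspend the device and drop the lock. */
void fake_dvb_init_end(struct fake_em28xx *dev)
{
	dev->mode = FAKE_SUSPEND;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * V4L2-style open: it may only switch to analog mode while holding
 * the device lock, so it cannot disturb a DVB init in progress.
 */
int fake_v4l2_open(struct fake_em28xx *dev)
{
	if (pthread_mutex_trylock(&dev->lock) != 0)
		return -EBUSY;
	dev->mode = FAKE_ANALOG_MODE;
	pthread_mutex_unlock(&dev->lock);
	return 0;
}
```

If the lock is dropped from the DVB init path, as the patch does, nothing stops the open path from flipping the mode to analog halfway through frontend detection.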

Maybe the proper fix would be to change the logic under em28xx_usb_probe()
to not hold dev->lock anymore when the device is loading the extensions.
> 
> Signed-off-by: Chris Rankin <rankincj@yahoo.com>
> 
> 
> EM28xx-replug-deadlock.diff
> 
> 
> --- linux-3.0/drivers/media/video/em28xx/em28xx-dvb.c.orig	2011-08-19 00:50:41.000000000 +0100
> +++ linux-3.0/drivers/media/video/em28xx/em28xx-dvb.c	2011-08-19 00:51:03.000000000 +0100
> @@ -542,7 +542,6 @@
>  	dev->dvb = dvb;
>  	dvb->fe[0] = dvb->fe[1] = NULL;
>  
> -	mutex_lock(&dev->lock);
>  	em28xx_set_mode(dev, EM28XX_DIGITAL_MODE);
>  	/* init frontend */
>  	switch (dev->model) {
> @@ -711,7 +710,6 @@
>  	em28xx_info("Successfully loaded em28xx-dvb\n");
>  ret:
>  	em28xx_set_mode(dev, EM28XX_SUSPEND);
> -	mutex_unlock(&dev->lock);
>  	return result;
>  
>  out_free:

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
Chris Rankin Aug. 20, 2011, 2:40 p.m. UTC | #2
--- On Sat, 20/8/11, Mauro Carvalho Chehab <mchehab@redhat.com> wrote:
> No. The extension load can happen after the usb probe
> phase. In practice, the only case where the extension init will happen
> together with the usb probe phase is when the em28xx modules are
> compiled builtin

It also happens when someone plugs an adapter into the machine when the modules are already loaded. E.g. someone plugging a second adapter in, or unplugging and then replugging the same one.

> Maybe the proper fix would be to change the logic under
> em28xx_usb_probe() to not hold dev->lock anymore when the device is
> loading the extensions.

I could certainly write such a patch, although I only have a PCTV 290e adapter to test with.

Is this problem unique to the em28xx-dvb module? How does the em28xx-alsa module get away with creating ALSA devices without causing a similar race condition?

Cheers,
Chris

  
Mauro Carvalho Chehab Aug. 20, 2011, 3:02 p.m. UTC | #3
On 20-08-2011 07:40, Chris Rankin wrote:
> --- On Sat, 20/8/11, Mauro Carvalho Chehab <mchehab@redhat.com> wrote:
>> No. The extension load can happen after the usb probe
>> phase. In practice, the only case where the extension init will happen
>> together with the usb probe phase is when the em28xx modules are
>> compiled builtin
> 
> It also happens when someone plugs an adapter into the machine when the modules are already loaded. E.g. someone plugging a second adapter in, or unplugging and then replugging the same one.

Yes.

>> Maybe the proper fix would be to change the logic under
>> em28xx_usb_probe() to not hold dev->lock anymore when the device is
>> loading the extensions.
> 
> I could certainly write such a patch, although I only have a PCTV 290e adapter to test with.

We can test it on more devices.

> Is this problem unique to the em28xx-dvb module? How does the em28xx-alsa module get 
> away with creating ALSA devices without causing a similar race condition?

It might also affect alsa. Pulseaudio has the bad habit of opening the device. However,
the device lock is held when the driver is changing the lock.

It should be noted that the current mutex lock strategy is a workaround. The proper
solution is to work on a resource lock library that could be used when access to a
device via one API blocks access to the same device via another API.
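
To make the idea concrete, here is a minimal userspace sketch of what such a resource lock could look like. Everything in it (struct res_lock, res_acquire(), res_release(), the RES_* owners) is hypothetical and invented for illustration; it is not an existing kernel API. The first API to claim the device owns it, further users of the same API are counted, and the other API gets -EBUSY until the last user is gone.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical owners: which API currently controls the hardware. */
enum res_owner { RES_FREE, RES_V4L2, RES_DVB };

struct res_lock {
	pthread_mutex_t guard; /* protects owner and users */
	enum res_owner owner;
	int users;             /* open handles of the owning API */
};

void res_init(struct res_lock *r)
{
	pthread_mutex_init(&r->guard, NULL);
	r->owner = RES_FREE;
	r->users = 0;
}

/* Claim the device for one API; fails if the other API owns it. */
int res_acquire(struct res_lock *r, enum res_owner who)
{
	int ret = -EBUSY;

	pthread_mutex_lock(&r->guard);
	if (r->owner == RES_FREE || r->owner == who) {
		r->owner = who;
		r->users++;
		ret = 0;
	}
	pthread_mutex_unlock(&r->guard);
	return ret;
}

/* Drop one claim; the device becomes free when the last user leaves. */
void res_release(struct res_lock *r, enum res_owner who)
{
	pthread_mutex_lock(&r->guard);
	if (r->owner == who && r->users > 0 && --r->users == 0)
		r->owner = RES_FREE;
	pthread_mutex_unlock(&r->guard);
}
```

Unlike holding dev->lock across a whole initialization, the guard mutex here is only held for the few instructions that update the ownership state.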

We had some discussions about that during the USB mini-summit last Monday. We'll
probably discuss it further during the Kernel Summit/media workshop.

Thanks,
Mauro

  

Patch

--- linux-3.0/drivers/media/video/em28xx/em28xx-dvb.c.orig	2011-08-19 00:50:41.000000000 +0100
+++ linux-3.0/drivers/media/video/em28xx/em28xx-dvb.c	2011-08-19 00:51:03.000000000 +0100
@@ -542,7 +542,6 @@ 
 	dev->dvb = dvb;
 	dvb->fe[0] = dvb->fe[1] = NULL;
 
-	mutex_lock(&dev->lock);
 	em28xx_set_mode(dev, EM28XX_DIGITAL_MODE);
 	/* init frontend */
 	switch (dev->model) {
@@ -711,7 +710,6 @@ 
 	em28xx_info("Successfully loaded em28xx-dvb\n");
 ret:
 	em28xx_set_mode(dev, EM28XX_SUSPEND);
-	mutex_unlock(&dev->lock);
 	return result;
 
 out_free: