Mantis CAM not SMP safe / Activating CAM on Technisat Skystar HD2 (DVB-S2)

Message ID 4EE5E0BE.4060300@kolumbus.fi (mailing list archive)
State RFC, archived

Commit Message

Marko Ristola Dec. 12, 2011, 11:08 a.m. UTC
  On 12/10/2011 01:57 AM, Ninja wrote:
> Hi,
>
> has anyone an idea how the SMP problems could be fixed?

You could turn on the Mantis kernel module's debug messages.
They would show you which interrupts are emitted.

One risky thing in the interrupt handler code is that
MANTIS_GPIF_STATUS is read and cleared on every interrupt, even when
IRQ0 isn't active yet.
This could lead to a rare starvation of the wait queue you described.
I supplied a patch below. Does it help?
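
To illustrate the ordering issue, here is a minimal userspace sketch; it is
not driver code. It assumes that the GPIF status bits are latched before
IRQ0 is raised and that the status register is cleared by writing the read
bits back. Both are assumptions, since there is no public documentation.
The struct fake_dev, the bit values and all function names are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define INT_RISCI  0x01u  /* stands in for an unrelated (DMA) interrupt */
#define INT_IRQ0   0x02u  /* stands in for MANTIS_INT_IRQ0 */
#define GPIF_WRACK 0x01u  /* stands in for MANTIS_GPIF_WRACK */

struct fake_dev {
	uint32_t int_stat;    /* pending interrupt sources */
	uint32_t gpif_status; /* status register, assumed write-1-to-clear */
	uint32_t saved_gpif;  /* what the waiter sees (mantis->gpif_status) */
};

/* Read the masked status bits and clear exactly the bits that were read. */
static uint32_t read_and_clear_gpif(struct fake_dev *d)
{
	uint32_t s = d->gpif_status & GPIF_WRACK;

	d->gpif_status &= ~s;
	return s;
}

/* Ordering before the patch: status is cleared on every interrupt. */
static void handler_old(struct fake_dev *d)
{
	uint32_t stat = d->int_stat;
	uint32_t rst_stat = read_and_clear_gpif(d);

	if (stat & INT_IRQ0)
		d->saved_gpif = rst_stat;
	d->int_stat = 0; /* acknowledge all sources */
}

/* Ordering after the patch: status is only touched when IRQ0 is set. */
static void handler_new(struct fake_dev *d)
{
	uint32_t stat = d->int_stat;

	if (stat & INT_IRQ0)
		d->saved_gpif = read_and_clear_gpif(d);
	d->int_stat = 0; /* acknowledge all sources */
}

/*
 * Scenario: the write acknowledge is latched, an unrelated interrupt is
 * handled, and only then IRQ0 arrives. Returns the status the waiter sees.
 */
static uint32_t run_scenario(void (*handler)(struct fake_dev *))
{
	struct fake_dev d = {0};

	d.gpif_status |= GPIF_WRACK; /* device latches the acknowledge */
	d.int_stat = INT_RISCI;      /* unrelated interrupt fires first */
	handler(&d);
	d.int_stat = INT_IRQ0;       /* IRQ0 is raised afterwards */
	handler(&d);
	return d.saved_gpif;
}

#ifdef DEMO_MAIN
int main(void)
{
	uint32_t old_seen = run_scenario(handler_old);
	uint32_t new_seen = run_scenario(handler_new);

	assert(old_seen == 0);          /* acknowledge was thrown away */
	assert(new_seen == GPIF_WRACK); /* acknowledge reaches the waiter */
	printf("old ordering: waiter sees 0x%x\n", (unsigned)old_seen);
	printf("new ordering: waiter sees 0x%x\n", (unsigned)new_seen);
	return 0;
}
#endif
```

Built with -DDEMO_MAIN, this runs both orderings through the same scenario.
With the old ordering the acknowledge bit is discarded by the unrelated
interrupt, so a waiter woken by the later IRQ0 would find no status.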

> I did some further investigation. When comparing the number of
> interrupts with all cores enabled to the number with only one core
> enabled, it seems like only IRQ0 changed; the other IRQs and the
> total number stay about the same:
>
> 4 Cores:
> All IRQ/sec: 493
> Masked IRQ/sec: 400
> Unknown IRQ/sec: 0
> DMA/sec: 400
> IRQ-0/sec: 143
> IRQ-1/sec: 0
> OCERR/sec: 0
> PABRT/sec: 0
> RIPRR/sec: 0
> PPERR/sec: 0
> FTRGT/sec: 0
> RISCI/sec: 258
> RACK/sec: 0
>
> 1 Core:
> All IRQ/sec: 518
> Masked IRQ/sec: 504
> Unknown IRQ/sec: 0
> DMA/sec: 504
> IRQ-0/sec: 246
> IRQ-1/sec: 0
> OCERR/sec: 0
> PABRT/sec: 0
> RIPRR/sec: 0
> PPERR/sec: 0
> FTRGT/sec: 0
> RISCI/sec: 258
> RACK/sec: 0
>
> So, where might be the problem?
Turning on the Mantis debug messages might show the difference between these interrupts.
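
To put numbers on the comparison quoted above, here is a small sketch that
computes the relative change of each non-zero counter between the two runs.
The figures are the ones from the quoted statistics; the struct and function
names are made up for this illustration.

```c
#include <stdio.h>

/* One counter from the quoted statistics, in interrupts per second. */
struct irq_rate {
	const char *name;
	int four_cores;
	int one_core;
};

static const struct irq_rate rates[] = {
	{ "All IRQ",    493, 518 },
	{ "Masked IRQ", 400, 504 },
	{ "DMA",        400, 504 },
	{ "IRQ-0",      143, 246 },
	{ "RISCI",      258, 258 },
};

/* Change of the 4-core rate relative to the 1-core rate, in percent. */
static double relative_change(int four_cores, int one_core)
{
	return 100.0 * (four_cores - one_core) / one_core;
}

#ifdef DEMO_MAIN
int main(void)
{
	size_t i;

	for (i = 0; i < sizeof(rates) / sizeof(rates[0]); i++)
		printf("%-10s %4d %4d %+7.1f%%\n", rates[i].name,
		       rates[i].four_cores, rates[i].one_core,
		       relative_change(rates[i].four_cores,
				       rates[i].one_core));
	return 0;
}
#endif
```

IRQ-0 drops by roughly 42 percent with four cores while RISCI is unchanged.
The Masked IRQ and DMA counters fall by 104 per second, almost the same
absolute amount as the IRQ-0 drop of 103 per second.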

....
> I hope somebody can help, because I think we are very close to a fully functional CAM here.
> I ran out of things to test to get closer to the solution :(
> Btw: Is there any documentation available for the mantis PCI bridge?
Not that I know of.

>
> Manuel
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


Regards,
Marko Ristola

----------------------- PATCH ------------------------------
Mantis/Hopper: Check and clear GPIF status bits only when IRQ0 bit is active.

Signed-off-by: Marko Ristola <Marko.Ristola@kolumbus.fi>


Comments

Ninja Dec. 13, 2011, 9:30 p.m. UTC | #1
On 12.12.2011 12:08, Marko Ristola wrote:
> One risky thing with the Interrupt handler code is that
> MANTIS_GPIF_STATUS is cleared, even though IRQ0 isn't active yet.
> This could lead to a rare starvation of the wait queue you described.
> I supplied a patch below. Does it help?

Hi Marko,

thanks for the patch. I did some quick testing today. The IRQ0 problem 
stays, but it seems like the small hangs (3-5 seconds every 20 minutes 
or something) are fixed :)

Manuel
Ninja Dec. 16, 2011, 8:10 a.m. UTC | #2
On 13.12.2011 22:30, Ninja wrote:
> thanks for the patch. I did some quick testing today. The IRQ0 problem
> stays, but it seems like the small hangs (3-5 seconds every 20 minutes
> or something) are fixed :)

Hi,

I did some further investigation of my problem. Almost all IRQ0s
originate from calls to the function "mantis_hif_read_iom" (at least when
the CAM is up and running). Changing the udelay between the writes to
about 100 almost gets rid of the lost IRQ0 problem, but somehow it also
increases the total number of interrupts, and of IRQ0s, to about two to
three times the numbers seen with udelay(20).
This increase doesn't happen when reducing the number of cores as a
workaround.
And getting *almost* no timeouts doesn't help much, because every
timeout causes a hang/freeze until the CAM is initialized again.
Changing the PCI latency to 0xff didn't help either.

Btw: Marko's DMA patches, posted in the other thread "Multiple
Mantis devices gives me glitches", don't help me further, since I'm
using the latest code, which already includes them.

Manuel

Patch

diff --git a/drivers/media/dvb/mantis/hopper_cards.c b/drivers/media/dvb/mantis/hopper_cards.c
index 71622f6..c2084e9 100644
--- a/drivers/media/dvb/mantis/hopper_cards.c
+++ b/drivers/media/dvb/mantis/hopper_cards.c
@@ -84,15 +84,6 @@  static irqreturn_t hopper_irq_handler(int irq, void *dev_id)
  	if (!(stat & mask))
  		return IRQ_NONE;
  
-	rst_mask  = MANTIS_GPIF_WRACK  |
-		    MANTIS_GPIF_OTHERR |
-		    MANTIS_SBUF_WSTO   |
-		    MANTIS_GPIF_EXTIRQ;
-
-	rst_stat  = mmread(MANTIS_GPIF_STATUS);
-	rst_stat &= rst_mask;
-	mmwrite(rst_stat, MANTIS_GPIF_STATUS);
-
  	mantis->mantis_int_stat = stat;
  	mantis->mantis_int_mask = mask;
  	dprintk(MANTIS_DEBUG, 0, "\n-- Stat=<%02x> Mask=<%02x> --", stat, mask);
@@ -101,6 +92,16 @@  static irqreturn_t hopper_irq_handler(int irq, void *dev_id)
  	}
  	if (stat & MANTIS_INT_IRQ0) {
  		dprintk(MANTIS_DEBUG, 0, "<%s>", label[1]);
+
+		rst_mask  = MANTIS_GPIF_WRACK  |
+			    MANTIS_GPIF_OTHERR |
+			    MANTIS_SBUF_WSTO   |
+			    MANTIS_GPIF_EXTIRQ;
+
+		rst_stat  = mmread(MANTIS_GPIF_STATUS);
+		rst_stat &= rst_mask;
+		mmwrite(rst_stat, MANTIS_GPIF_STATUS);
+
  		mantis->gpif_status = rst_stat;
  		wake_up(&ca->hif_write_wq);
  		schedule_work(&ca->hif_evm_work);
diff --git a/drivers/media/dvb/mantis/mantis_cards.c b/drivers/media/dvb/mantis/mantis_cards.c
index c2bb90b..109a5fb 100644
--- a/drivers/media/dvb/mantis/mantis_cards.c
+++ b/drivers/media/dvb/mantis/mantis_cards.c
@@ -92,15 +92,6 @@  static irqreturn_t mantis_irq_handler(int irq, void *dev_id)
  	if (!(stat & mask))
  		return IRQ_NONE;
  
-	rst_mask  = MANTIS_GPIF_WRACK  |
-		    MANTIS_GPIF_OTHERR |
-		    MANTIS_SBUF_WSTO   |
-		    MANTIS_GPIF_EXTIRQ;
-
-	rst_stat  = mmread(MANTIS_GPIF_STATUS);
-	rst_stat &= rst_mask;
-	mmwrite(rst_stat, MANTIS_GPIF_STATUS);
-
  	mantis->mantis_int_stat = stat;
  	mantis->mantis_int_mask = mask;
  	dprintk(MANTIS_DEBUG, 0, "\n-- Stat=<%02x> Mask=<%02x> --", stat, mask);
@@ -109,6 +100,15 @@  static irqreturn_t mantis_irq_handler(int irq, void *dev_id)
  	}
  	if (stat & MANTIS_INT_IRQ0) {
  		dprintk(MANTIS_DEBUG, 0, "<%s>", label[1]);
+		rst_mask  = MANTIS_GPIF_WRACK  |
+			    MANTIS_GPIF_OTHERR |
+			    MANTIS_SBUF_WSTO   |
+			    MANTIS_GPIF_EXTIRQ;
+
+		rst_stat  = mmread(MANTIS_GPIF_STATUS);
+		rst_stat &= rst_mask;
+		mmwrite(rst_stat, MANTIS_GPIF_STATUS);
+
  		mantis->gpif_status = rst_stat;
  		wake_up(&ca->hif_write_wq);
  		schedule_work(&ca->hif_evm_work);