
Get confidence score of CRNN to regularize the detection outputs of TextBoxes

Open ahmedmazari-dhatim opened this issue 8 years ago • 6 comments

Hello, let me first thank you for these excellent papers: TextBoxes + CRNN. On the first page of the TextBoxes paper, the following is mentioned: "we use the confidence scores of CRNN to regularize the detection outputs of TextBoxes".

However, I am stuck on getting the probability of the sequence output by CRNN.

For example:

--h-e--ll-oo- => 'hello' with a probability of 0.89, for instance. How can I get that? I'm using the PyTorch version.
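
The closest I can get on my own is a greedy decode that multiplies the per-step maxima of the softmax output as a naive score. A sketch (assuming `probs` is the CRNN softmax of shape (seq_len, num_classes) with the blank at index 0), which only gives the probability of the single best path, not the true CTC probability of the labelling:

import torch

def greedy_decode(probs, alphabet, blank=0):
    # probs: softmax outputs of shape (seq_len, num_classes); class 0 is the blank
    max_probs, indices = probs.max(dim=1)      # most likely class at each time step
    chars, confidence, prev = [], 1.0, blank
    for p, idx in zip(max_probs.tolist(), indices.tolist()):
        if idx != blank and idx != prev:       # collapse repeats and drop blanks
            chars.append(alphabet[idx - 1])    # alphabet[0] corresponds to class 1
        prev = idx
        confidence *= p                        # probability of the best single path
    return ''.join(chars), confidence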

In the code of CTCLoss I can't find these probabilities.

In __init__.py the CTC class is defined as follows; however, I don't see where to get the output probabilities:

class _CTC(Function):
    def forward(self, acts, labels, act_lens, label_lens):
        is_cuda = acts.is_cuda
        acts = acts.contiguous()
        loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
        grads = torch.zeros(acts.size()).type_as(acts)
        minibatch_size = acts.size(1)
        costs = torch.zeros(minibatch_size)
        loss_func(acts,
                  grads,
                  labels,
                  label_lens,
                  act_lens,
                  minibatch_size,
                  costs)
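        # costs now holds the negative log-likelihood of each example in the batch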
        self.grads = grads
        self.costs = torch.FloatTensor([costs.sum()])
        return self.costs

    def backward(self, grad_output):
        return self.grads, None, None, None


class CTCLoss(Module):
    def __init__(self):
        super(CTCLoss, self).__init__()

    def forward(self, acts, labels, act_lens, label_lens):
        """
        acts: Tensor of (seqLength x batch x outputDim) containing output from network
        labels: 1 dimensional Tensor containing all the targets of the batch in one sequence
        act_lens: Tensor of size (batch) containing size of each output sequence from the network
        label_lens: Tensor of size (batch) containing the label length of each example
        """
        _assert_no_grad(labels)
        _assert_no_grad(act_lens)
        _assert_no_grad(label_lens)
        return _CTC()(acts, labels, act_lens, label_lens)


Thank you

ahmedmazari-dhatim · Jul 04 '17 14:07

We modified the CRNN code to output the probability. You can refer to this paper: http://www.machinelearning.org/proceedings/icml2006/047_Connectionist_Tempor.pdf
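
For reference, the probability that CTC assigns to a labelling l given the per-frame softmax outputs y is the sum, over all frame-level paths π that collapse to l after removing repeats and blanks, of the products of the selected outputs:

p(l | x) = Σ_{π : B(π) = l} Π_{t=1..T} y^t_{π_t}

The log of this sum is what the CTC forward-backward algorithm computes.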

MhLiao · Jul 06 '17 10:07

Hi @MhLiao,

Thank you very much for your answer. However, I can't find where the probabilities are output in CRNN. Would you mind telling me where I can print them?

Thank you @MhLiao

ahmedmazari-dhatim · Jul 10 '17 16:07

@ahmedmazari-dhatim You can refer to Equation 14 in the paper above, which describes how CTC computes the probability. There is a variable named logProb in "crnn/src/cpp/ctc.cpp"; you can get the score with an "exp" operation.
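
With the PyTorch binding you can also read the same score off the loss itself, since warp-ctc's cost is the negative log-likelihood of the target sequence. An untested sketch for a single image (batch of 1, so the summed cost equals the per-example cost; all sizes and label indices below are made up):

import torch
from warpctc_pytorch import CTCLoss

criterion = CTCLoss()

# 26 time steps, batch of 1, 37 classes; warp-ctc takes the raw
# pre-softmax activations and applies the softmax internally
acts = torch.randn(26, 1, 37)
label = torch.IntTensor([8, 5, 12, 12, 15])   # hypothetical indices for 'hello'
act_lens = torch.IntTensor([acts.size(0)])
label_lens = torch.IntTensor([label.size(0)])

cost = criterion(acts, label, act_lens, label_lens)  # -log p(label | input)
score = torch.exp(-cost)                             # p(label | input)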

MhLiao · Jul 11 '17 08:07

Hello @MhLiao ,

Thank you very much for your answer. However, I'm using the PyTorch version, which is why I'm asking: from the PyTorch version of CRNN I don't have access to crnn/src/cpp/ctc.cpp.

In crnn_main.py I have the following: criterion = CTCLoss(), where CTCLoss() is:

import torch
import warpctc_pytorch as warp_ctc
from torch.autograd import Function
from torch.nn import Module
from torch.nn.modules.loss import _assert_no_grad
from torch.utils.ffi import _wrap_function
from ._warp_ctc import lib as _lib, ffi as _ffi

__all__ = []


def _import_symbols(locals):
    for symbol in dir(_lib):
        fn = getattr(_lib, symbol)
        locals[symbol] = _wrap_function(fn, _ffi)
        __all__.append(symbol)


_import_symbols(locals())


class _CTC(Function):
    def forward(self, acts, labels, act_lens, label_lens):
        is_cuda = acts.is_cuda
        acts = acts.contiguous()
        loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
        grads = torch.zeros(acts.size()).type_as(acts)
        minibatch_size = acts.size(1)
        costs = torch.zeros(minibatch_size)
        loss_func(acts,
                  grads,
                  labels,
                  label_lens,
                  act_lens,
                  minibatch_size,
                  costs)
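        # costs now holds the negative log-likelihood of each example in the batch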
        self.grads = grads
        self.costs = torch.FloatTensor([costs.sum()])
        return self.costs

    def backward(self, grad_output):
        return self.grads, None, None, None


class CTCLoss(Module):
    def __init__(self):
        super(CTCLoss, self).__init__()

    def forward(self, acts, labels, act_lens, label_lens):
        """
        acts: Tensor of (seqLength x batch x outputDim) containing output from network
        labels: 1 dimensional Tensor containing all the targets of the batch in one sequence
        act_lens: Tensor of size (batch) containing size of each output sequence from the network
        label_lens: Tensor of size (batch) containing the label length of each example
        """
        _assert_no_grad(labels)
        _assert_no_grad(act_lens)
        _assert_no_grad(label_lens)
        return _CTC()(acts, labels, act_lens, label_lens)



I'm wondering if there is a way to get the probabilities from the PyTorch version as you suggested. Any idea how to do that, @MhLiao?

The corresponding place to get the probabilities is line 80 of ctc.cpp:

// compute log-likelihood
T logProb = fvars.at({inputLength-1, nSegment-1});

Thank you again

ahmedmazari-dhatim · Jul 11 '17 15:07

@ahmedmazari-dhatim I am sorry, I have not read the PyTorch code. But I guess the PyTorch code also uses the CTC wrapper (warp-ctc), which is written in C++.

MhLiao · Jul 17 '17 12:07

Hi @MhLiao, yes, but I can't find how to access cpp/ctc.cpp from the PyTorch version.

In the PyTorch version we only have the CTCLoss() class, which is exactly the wrapper I pasted above.
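
One untested idea, based on that wrapper: since costs already holds the per-example negative log-likelihoods before they are summed, calling the warp-ctc kernel directly keeps them around, and exp(-cost) then gives the score:

import torch
import warpctc_pytorch as warp_ctc

def ctc_scores(acts, labels, act_lens, label_lens):
    # same call as in _CTC.forward, but the per-example costs are
    # returned instead of being summed into a single loss value
    acts = acts.contiguous()
    loss_func = warp_ctc.gpu_ctc if acts.is_cuda else warp_ctc.cpu_ctc
    grads = torch.zeros(acts.size()).type_as(acts)
    costs = torch.zeros(acts.size(1))   # one cost per example in the batch
    loss_func(acts, grads, labels, label_lens, act_lens, acts.size(1), costs)
    return torch.exp(-costs)            # cost = -logProb, as in ctc.cpp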




ahmedmazari-dhatim · Jul 17 '17 12:07