tf-lm

How to use it for computing sentence probability

Open abubakar-ucr opened this issue 6 years ago • 4 comments

Hi, how can I use it to compute sentence probability, which is the main functionality of language modeling? For example, for a given sentence s, I want to know the probability that s belongs to the language the model has been trained on.

abubakar-ucr avatar Nov 05 '18 22:11 abubakar-ucr

Hi,

You can use the 'rescore' option for this: add 'rescore' to your configuration file and, as its value, give the path to a file containing the sentences you want to rescore. The output will be log probabilities per sentence, written to the 'result' file.

Good luck! Lyan
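
As an illustration of how a per-sentence log probability relates to the sentence probability asked about, here is a minimal Python sketch based on the chain rule, log P(s) = sum over i of log P(w_i | w_1..w_{i-1}). The per-word probabilities are made-up numbers, the function names are hypothetical, and the thread does not say which logarithm base tf-lm uses, so the base is left as a parameter.

```python
import math

def sentence_log_prob(word_log_probs):
    """Chain rule: the sentence log probability is the sum of the
    per-word conditional log probabilities."""
    return sum(word_log_probs)

def to_probability(log_prob, base=math.e):
    """Convert a log probability back to a probability.
    The base is an assumption; use the one the model reports."""
    return base ** log_prob

def per_word_perplexity(log_prob, num_words, base=math.e):
    """Length-normalised score, useful for comparing sentences
    of different lengths."""
    return base ** (-log_prob / num_words)

# Hypothetical conditional probabilities for a four-word sentence.
word_probs = [0.2, 0.1, 0.5, 0.25]
word_log_probs = [math.log(p) for p in word_probs]

log_p = sentence_log_prob(word_log_probs)
prob = to_probability(log_p)
ppl = per_word_perplexity(log_p, len(word_probs))

print(f"log P(s) = {log_p:.4f}")
print(f"P(s)     = {prob:.6f}")
print(f"PPL      = {ppl:.4f}")
```

Because raw sentence probabilities shrink quickly as sentences get longer, comparing log probabilities or per-word perplexity is usually more practical than comparing P(s) directly.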

lverwimp avatar Nov 09 '18 07:11 lverwimp

Thanks for your reply. Can you provide a sample command to run, or a sample configuration file? I am confused about where to pass the rescore option.

abubakar-ucr avatar Nov 09 '18 07:11 abubakar-ucr

There is an example on GitHub: en-ptb_word_discourse_rescore.config ( https://github.com/lverwimp/tf-lm/blob/master/config/en-ptb_word_discourse_rescore.config )

You need to change the value 'True' after 'rescore' to a filename, and then that file will be rescored instead of the default test file.
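
The change described above can be sketched as a small Python helper. This is only an illustration under assumptions: it treats the configuration file as one whitespace-separated "key value" pair per line, and the `name example_model` line, the file name `my_sentences.txt`, and the function name `set_rescore_file` are hypothetical. Check the linked example config for the actual layout before relying on it.

```python
def set_rescore_file(config_text, sentences_path):
    """Return config_text with the value of the 'rescore' key replaced
    by sentences_path. Assumes one whitespace-separated 'key value'
    pair per line; appends the key if it is missing."""
    out = []
    found = False
    for line in config_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "rescore":
            # Swap the old value (e.g. 'True') for the sentence file path.
            out.append(f"rescore {sentences_path}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"rescore {sentences_path}")
    return "\n".join(out) + "\n"

# Hypothetical two-line config, not copied from the repository.
example = "name example_model\nrescore True\n"
updated = set_rescore_file(example, "my_sentences.txt")
print(updated)
```

Editing the file by hand achieves the same thing; the helper is only convenient when many sentence files need their own config.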

lverwimp avatar Nov 09 '18 07:11 lverwimp

If I am using a pre-trained model downloaded from your website, would I still need the dataset (wiki or ptb)?

abubakar-ucr avatar Nov 09 '18 20:11 abubakar-ucr