Polarised Jet commondata implementation

Open toonhasenack opened this issue 1 year ago • 11 comments

Here we provide the implementation of the polarized Jet dataset.

Datasets marked with ✅ have been implemented and checked, while those marked with ❌ have been removed and are no longer part of the implementation.

1.1 datasets

| Datasets | Obs. | Correlation | Status | Comments | Inspire | HepData |
|---|---|---|---|---|---|---|
| PHENIX_1JET_200GEV_ALL | $A_{LL}$ | no | Ready | Table 18 | paper | dataset |
| STAR_2005_1JET_200GEV_ALL | $A_{LL}$ | no | Ready | Figure 14 | paper | dataset |
| STAR_2006_1JET_200GEV_ALL | $A_{LL}$ | no | Ready | Figure 15 | paper | dataset |
| STAR_2009_1JET_200GEV_ALL | $A_{LL}$ | Correlations | Ready | Table 3, 4 | paper | dataset |

New datasets

| Datasets | Obs. | Correlation | Status | Comments | Inspire | HepData |
|---|---|---|---|---|---|---|
| STAR_2009_2JET_SS_200GEV_ALL, STAR_2009_2JET_OS_200GEV_ALL | $A_{LL}$ | Correlations | Ready | Tables 7, 9 | paper | dataset |
| STAR_2009_2JET_A_200GEV_ALL, STAR_2009_2JET_B_200GEV_ALL, STAR_2009_2JET_C_200GEV_ALL | $A_{LL}$ | Correlations | Ready | Figure 9 (3 topologies) | paper | dataset |
| STAR_2012_1JET_510GEV_ALL | $A_{LL}$ | ✅ see paper appendix | Ready | Figure 12 | paper | dataset |
| STAR_2012_2JET_A_510GEV_ALL, STAR_2012_2JET_B_510GEV_ALL, STAR_2012_2JET_C_510GEV_ALL, STAR_2012_2JET_D_510GEV_ALL | $A_{LL}$ | ✅ see paper appendix | Ready | Figure 14 (4 topologies) | paper | dataset |
| STAR_2015_1JET_200GEV_ALL | $A_{LL}$ | ✅ Tabs 4-13 | Ready | Table 1 | paper | dataset |
| STAR_2015_2JET_MIDRAP_SS_200GEV_ALL, STAR_2015_2JET_MIDRAP_OS_200GEV_ALL | $A_{LL}$ | ✅ Tabs 4-13 | Ready | Table 2 (top, bottom), correlated with 1JET | paper | dataset |
| STAR_2013_1JET_510GEV_ALL | $A_{LL}$ | ✅ HepData | Ready | Figure 3 | paper | dataset |
| STAR_2013_2JET_A_510GEV_ALL, STAR_2013_2JET_B_510GEV_ALL, STAR_2013_2JET_C_510GEV_ALL, STAR_2013_2JET_D_510GEV_ALL | $A_{LL}$ | ✅ HepData | Ready | Figure 5 | paper | dataset |

toonhasenack avatar Apr 04 '24 13:04 toonhasenack

Please @giacomomagni, add sys.path.append('../../') to all the files in which you changed the symmetrize_error import. Otherwise the filter.py files don't work.

toonhasenack avatar Apr 23 '24 11:04 toonhasenack

> Please @giacomomagni, add sys.path.append('../../') to all the files in which you changed the symmetrize_error import. Otherwise the filter.py files don't work.

Weird, for me they work as they are right now...

EDIT: you should change the import to:

from nnpdf_data.filter_utils.correlations import compute_covmat

This way it works without appending the path.

giacomomagni avatar Apr 23 '24 13:04 giacomomagni

@giacomomagni, I tried your implementation and it didn't work. I suggest we append the import path, since that works for everyone.

toonhasenack avatar Apr 24 '24 08:04 toonhasenack

> @giacomomagni, I tried your implementation and it didn't work. I suggest we append the import path, since that works for everyone.

I believe you have to install nnpdf_data, so just go to nnpdf/nnpdf_data, install the package in develop mode, and you should be fine. I understand that appending ../../ works, but it's not the proper fix.
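
As a quick diagnostic for the situation described above, one can check whether the package is visible on the current import path before touching `sys.path`. This is only a minimal sketch: `is_importable` is a hypothetical helper written for illustration (it is not part of nnpdf_data), and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path.

    Hypothetical helper for illustration; intended for top-level names
    such as "nnpdf_data", not dotted submodule paths.
    """
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_importable("nnpdf_data"):
        print("nnpdf_data found: absolute imports should work without sys.path edits.")
    else:
        print("nnpdf_data not found: install the package (e.g. in develop mode) first.")
```

If the check fails in the environment used to run the filter scripts, that points to a missing or stale installation rather than to a problem with the import statement itself.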

giacomomagni avatar Apr 24 '24 08:04 giacomomagni

In principle it should also work by going to the root of the repository and installing there. If it doesn't, we should fix it.

It might be, however, that @toonhasenack installed before some changes to nnpdf_data (and, given the way develop mode works, not all changes can be propagated).

The layout of nnpdf_data is still under development (c.f. https://github.com/NNPDF/nnpdf/pull/2056) so apologies for the rough edges.

scarlehoff avatar Apr 24 '24 08:04 scarlehoff

Hi @Radonirinaunimi and @enocera, some work is still needed to fix the correlations in STAR_2009_**, but since the number of files changed is quite large and the procedure is rather similar for all the (inclusive, dijet) dataset pairs, you can start having a look.

In particular, please double check that our understanding of the correlations is correct, thanks in advance.

giacomomagni avatar Apr 24 '24 14:04 giacomomagni

> Hi @Radonirinaunimi and @enocera, some work is still needed to fix the correlations in STAR_2009_**, but since the number of files changed is quite large and the procedure is rather similar for all the (inclusive, dijet) dataset pairs, you can start having a look.
>
> In particular, please double check that our understanding of the correlations is correct, thanks in advance.

Thanks a lot both! I will try to look asap.

Radonirinaunimi avatar Apr 24 '24 20:04 Radonirinaunimi

So, the implementation of the correlations from the correlation matrices is fine. In this sense, dataset-wise, everything is good to go AFAICT.

Re the changes that should not be part of this PR: I can slowly take care of them.

Radonirinaunimi avatar May 14 '24 06:05 Radonirinaunimi

@enocera I think I've addressed all your comments except for one. It looks to me that in some datasets the dijet statistical uncertainties are correlated and in others they are not, so I'm not sure what the best option is.

giacomomagni avatar May 17 '24 13:05 giacomomagni

> @enocera I think I've addressed all your comments except for one. It looks to me that in some datasets the dijet statistical uncertainties are correlated and in others they are not, so I'm not sure what the best option is.

Thanks @giacomomagni . This is not 100% clear from the papers indeed. I am planning to write to Elke Aschenauer and ask her what we should do with the correlations in the STAR data. I will also check with her that we are not missing any important measurement.

enocera avatar May 17 '24 14:05 enocera

Thank you for taking care of this!

giacomomagni avatar May 17 '24 16:05 giacomomagni

@giacomomagni I talked to Elke Aschenauer about the set of data that we consider and the treatment of uncertainties. Elke is saying two things.

  1. There are two additional papers on single-inclusive jets that we may want to consider. These are arXiv:0710.2048 and hep-ex/0608030. They are rather old and the measurements have rather large uncertainties (this is the reason why they were not considered in NNPDFpol1.1). If it is not too much effort, we may consider including them.
  2. Concerning correlations, she confirmed that the recipe to be used is the following (for all of the jet and dijet data). One takes the correlation matrix and multiplies each entry by a total uncertainty, which is the sum in quadrature of the statistical and systematic uncertainties (except special uncs, like the pol. beam uncertainty, which we already treat separately). So, in other words: the provided correlation matrix already takes into account the fact that statistical uncertainties in dijets are uncorrelated. Therefore, also for dijets one has to multiply the corr mat. entries by the sum in quadrature of stat. and sys. uncs.
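
The recipe in point 2 can be sketched in a few lines. This is an illustrative reading, not the nnpdf_data implementation: `covmat_from_corrmat` is a hypothetical name, and the sketch assumes that "multiplying each entry by a total uncertainty" means $\mathrm{cov}_{ij} = \rho_{ij}\,\sigma_i\,\sigma_j$ with $\sigma_i = \sqrt{\mathrm{stat}_i^2 + \mathrm{sys}_i^2}$. Special uncertainties (such as the beam polarisation one) are assumed to be handled separately and are not included. The numbers are toy values, not STAR data.

```python
import math


def covmat_from_corrmat(corrmat, stat, sys_unc):
    """Build a covariance matrix from a correlation matrix and per-bin uncertainties.

    Hypothetical helper (not the nnpdf_data API). Assumes
    cov[i][j] = corrmat[i][j] * sigma[i] * sigma[j], where sigma[i] is the
    sum in quadrature of the statistical and systematic uncertainty of bin i.
    """
    n = len(stat)
    if len(sys_unc) != n or len(corrmat) != n or any(len(row) != n for row in corrmat):
        raise ValueError("corrmat must be n x n and match the uncertainty lists")
    # Total uncertainty per bin: stat and sys added in quadrature.
    total = [math.hypot(s, y) for s, y in zip(stat, sys_unc)]
    # Scale every correlation entry by the total uncertainties of both bins.
    return [[corrmat[i][j] * total[i] * total[j] for j in range(n)] for i in range(n)]


# Toy example with two bins (made-up values).
corr = [[1.0, 0.5], [0.5, 1.0]]
cov = covmat_from_corrmat(corr, stat=[3.0, 6.0], sys_unc=[4.0, 8.0])
# Total uncertainties are 5 and 10, so cov is [[25.0, 25.0], [25.0, 100.0]].
```

With NumPy arrays the same operation is `corr * np.outer(total, total)`. Under this reading the same function applies to single-jet and dijet bins alike, since any absence of statistical correlation between dijet bins is already encoded in the provided correlation matrix.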

enocera avatar Jul 02 '24 05:07 enocera

Hi @enocera, thank you for your message.

  • Point 2. I'm happy to implement the prescription suggested by Elke Aschenauer. I'll do it asap and get back to you.
  • Point 1. I think we can include the data from hep-ex/0608030, which were taken in 2003-2004. Regarding the other reference, arXiv:0710.2048, I was wondering whether the dataset STAR_2005_1JET_200GEV_ALL (arXiv:1205.2735) already contains that information in an updated version. Both papers say that the data were recorded in 2005 with a luminosity of 2.1 pb^-1. Could you please confirm?

giacomomagni avatar Jul 02 '24 20:07 giacomomagni

Please, avoid merging PRs when the tests are still not passing...

scarlehoff avatar Jul 17 '24 13:07 scarlehoff