slither icon indicating copy to clipboard operation
slither copied to clipboard

Repeatable Sections

Open russ1985 opened this issue 13 years ago • 12 comments

Added ability to handle repeating sections.

russ1985 avatar Aug 06 '12 14:08 russ1985

Hi russ1985, I am going to be taking charge of approving pull requests. I'll be merging in my changes in the next couple of days, which might generate a couple of minor conflicts for you. Sorry, hope it's nothing too bad. Also, could you add a couple tests? We are using Rspec. This ensures that everything works, as well as provides everyone with examples of how to use your features.

sberkley avatar Aug 06 '12 20:08 sberkley

Yea, after I did the pull request I just thought "shouldn't I have added some specs". Sorry about that, will add some. I added a section in the Readme on how to use the feature.

On 8/6/12 4:36 PM, Scott Berkley wrote:

Hi russ1985, I am going to be taking charge of approving pull requests. I'll be merging in my changes in the next couple of days, which might generate a couple of minor conflicts for you. Sorry, hope it's nothing too bad. Also, could you add a couple tests? We are using Rspec. This ensures that everything works, as well as provides everyone with examples of how to use your features.

— Reply to this email directly or view it on GitHub https://github.com/ryanwood/slither/pull/10#issuecomment-7534983.

russ1985 avatar Aug 06 '12 21:08 russ1985

I guess I'm also a bit confused about what the purpose of this change is. The code currently allows repeated sections. Is it important for your application that they get put into separate namespaces in the resulting parse object?

sberkley avatar Aug 07 '12 17:08 sberkley

I must have missed that, from what I saw it did not support repeating sections. How do you specify repeating sections?

On 8/7/12 1:07 PM, Scott Berkley wrote:

I guess I'm also a bit confused about what the purpose of this change is. The code currently allows repeated sections. Is it important for your application that they get put into separate namespaces in the resulting parse object?

— Reply to this email directly or view it on GitHub https://github.com/ryanwood/slither/pull/10#issuecomment-7559950.

russ1985 avatar Aug 07 '12 17:08 russ1985

parser.rb line 8

 # This may be used in the future for non-linear or repeating sections
@mode = :linear

russ1985 avatar Aug 07 '12 17:08 russ1985

No special specification is needed. Sections are assigned by the 'trap' statement. So I can define my header section as a line that starts with 'HEAD'. Now if I have this file:

HEAD1234
body  747
more  787
HEAD5678

I will have two items in the header section, even though they don't come sequentially. They just get added to the array for that section, the same way that body lines do.

I'm not really sure what the intention of that mode comment is. Give it a try and let me know if it doesn't do what you need it to.

sberkley avatar Aug 07 '12 17:08 sberkley

I might be confused then.. from what I see in the code it reads the sections linearly so if you have a definition of

# Create a Slither::Defintion to describe a file format
Slither.define :simple do |d|
  d.header do |header|      
    header.trap { |line| line[0,1] == 1 }
    header.column :header_begin, 1, :type => :integer
    header.column :batch_number, 3, :type => :integer
  end
  d.data do |data|
    data.trap { |line| line[0,1] == 2 }
    data.column :data_begin, 1, :type => :integer
    data.column :record_number, 3, :type => :integer
    data.column :record_number_plus_batch, 6, :type => :integer
    data.column :id, 2, :type => :integer
    data.column :name, 10, :align => :left
  end
  d.tail_record do |tail_record|
    tail_record.trap { |line| line[0,1] == 3 }
    tail_record.column :tail_record_begin, 1, :type => :integer
    tail_record.column :num_records, 3, :type => :integer
  end
  d.footer do |footer|
    footer.trap { |line| line[0,1] == 4 }
    footer.column :footer_record_begin, 1, :type => :integer
    footer.column :total_record_count, 3, :type => :integer
    footer.column :batch_number, 3, :type => :integer
  end
end

and a file of

1001
200100100101Russell    
200200200102Jim
200300300103Bill
303
200100100104Tim    
200200200105Josh
3002
4005001

It would raise

Slither::RequiredSectionNotFoundError

because it would find another data record instead of a footer record.

make sense? I hope I am not confusing you more. I only added my repeatable sections because I ran into the problem above and kept getting

Slither::RequiredSectionNotFoundError

It might seem odd, but I have some requirements for something like that.

russ1985 avatar Aug 07 '12 18:08 russ1985

Oh, you're right, sorry. It turns out that we added repeatable sections in my change without my realizing it. We only deal with one section, so these issues are not on the top of my mind. My coworker changed how lines get added to sections to support parsing from IO objects, then I changed how the required section validation works because I saw that it didn't make sense any longer. It now parses all lines into sections, then checks to make sure all required sections have at least one row at the end. So the order of definition doesn't matter.

I did the merge on my stuff, so try pulling the latest and see if it works for you. If you have sections of different lengths like in your example, you'll have to change the definition to Slither.define :simple, :by_bytes => false do |d|. Sorry for the confusion.

sberkley avatar Aug 07 '12 18:08 sberkley

I gave your example a try and might have fixed some things that were causing it to fail.

Slither.define :repeat, :by_bytes => false do |d|
  d.header do |header|      
    header.trap { |line| line[0] == '1' }
    header.column :header_begin, 1, :type => :integer
    header.column :batch_number, 3, :type => :integer
  end
  d.data do |data|
    data.trap { |line| line[0] == '2' }
    data.column :data_begin, 1, :type => :integer
    data.column :record_number, 3, :type => :integer
    data.column :record_number_plus_batch, 6, :type => :integer
    data.column :id, 2, :type => :integer
    data.column :name, 10, :align => :left
  end
  d.tail_record do |tail_record|
    tail_record.trap { |line| line[0] == '3' }
    tail_record.column :tail_record_begin, 1, :type => :integer
    tail_record.column :num_records, 3, :type => :integer
  end
  d.footer do |footer|
    footer.trap { |line| line[0] == '4' }
    footer.column :footer_record_begin, 1, :type => :integer
    footer.column :total_record_count, 3, :type => :integer
    footer.column :batch_number, 3, :type => :integer
  end
end

Gives me

{:header=>[{:header_begin=>1, :batch_number=>1}], :data=>[{:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>1, :name=>"Russell"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>2, :name=>"Jim"}, {:data_begin=>2, :record_number=>3, :record_number_plus_batch=>3001, :id=>3, :name=>"Bill"}, {:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>4, :name=>"Tim"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>5, :name=>"Josh"}], :tail_record=>[{:tail_record_begin=>3, :num_records=>3}, {:tail_record_begin=>3, :num_records=>2}], :footer=>[{:footer_record_begin=>4, :total_record_count=>5, :batch_number=>1}]}

Note that we also added length validation, so make sure each line is padded with spaces to make it the right length. Does this output work for you?

sberkley avatar Aug 08 '12 14:08 sberkley

That looks good to me, one question though, how will it write the file out? Wit it write the data pieces all together? Or will it separate them like in my example?

Thanks!

On 8/8/12 10:05 AM, Scott Berkley wrote:

I gave your example a try and might have fixed some things that were causing it to fail.

|Slither.define :repeat, :by_bytes => false do |d| d.header do |header| header.trap { |line| line[0] == '1' } header.column :header_begin, 1, :type => :integer header.column :batch_number, 3, :type => :integer end d.data do |data| data.trap { |line| line[0] == '2' } data.column :data_begin, 1, :type => :integer data.column :record_number, 3, :type => :integer data.column :record_number_plus_batch, 6, :type => :integer data.column :id, 2, :type => :integer data.column :name, 10, :align => :left end d.tail_record do |tail_record| tail_record.trap { |line| line[0] == '3' } tail_record.column :tail_record_begin, 1, :type => :integer tail_record.column :num_records, 3, :type => :integer end d.footer do |footer| footer.trap { |line| line[0] == '4' } footer.column :footer_record_begin, 1, :type => :integer footer.column :total_record_count, 3, :type => :integer footer.column :batch_number, 3, :type => :integer end end |

Gives me

{:header=>[{:header_begin=>1, :batch_number=>1}], :data=>[{:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>1, :name=>"Russell"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>2, :name=>"Jim"}, {:data_begin=>2, :record_number=>3, :record_number_plus_batch=>3001, :id=>3, :name=>"Bill"}, {:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>4, :name=>"Tim"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>5, :name=>"Josh"}], :tail_record=>[{:tail_record_begin=>3, :num_records=>3}, {:tail_record_begin=>3, :num_records=>2}], :footer=>[{:footer_record_begin=>4, :total_record_count=>5, :batch_number=>1}]}

Note that we also added length validation, so make sure each line is padded with spaces to make it the right length. Does this output work for you?

— Reply to this email directly or view it on GitHub https://github.com/ryanwood/slither/pull/10#issuecomment-7584434.

russ1985 avatar Aug 08 '12 14:08 russ1985

I'll let you test that out. I would guess it will write all headers, all data, all tails, and then all footers, so probably not what you want. If that's the case, we can figure out a solution.

sberkley avatar Aug 08 '12 18:08 sberkley

I wrote some specs to test my changes, you can see them in my commit. I will merge your changes into mine and see if I can get it working with your changes.

thanks!

On Wed, Aug 8, 2012 at 2:22 PM, Scott Berkley [email protected]:

I'll let you test that out. I would guess it will write all headers, all data, all tails, and then all footers, so probably not what you want. If that's the case, we can figure out a solution.

— Reply to this email directly or view it on GitHubhttps://github.com/ryanwood/slither/pull/10#issuecomment-7592455.

russ1985 avatar Aug 09 '12 01:08 russ1985