nmaptocsv Better code structure to be able to parse large file (800MB+)

Mar 27 '18 19:03 maaaaz

Thanks for your amazing work! For me, the script doesn't handle files with more than ~30k lines.

I made a quick and dirty script to handle the issue.

#!/bin/bash

if [ "$#" -ne 2 ]; then
 echo """
///// The output file will be erased /////

nmaptocsv doesn't handle large files.
nmap sometimes generates large files that contain lots of tcpwrapped and unknown services that are false positives.

This script:
- removes the tcpwrapped and unknown results
- splits the large gnmap file by host
- sends all the generated chunks of the original file to nmaptocsv (split by host)
- configures nmaptocsv to output comma-separated fqdn,port,protocol,service,version
- removes the \" character and empty lines

usage: $0 <.gnmap input file> <result output file>

 """
exit 1
fi

output=$2
input=$1
folder="$input.d"

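mkdir -p "$folder"
# One chunk per host: a new file starts at every "Nmap scan report for" line.
awk '/Nmap scan report for/{c++} {print > (FILENAME "_.temporarygnmap" c) }' "$input"
mv -- "$input"_.temporarygnmap* "$folder"
# "$input_" would be read as an unset variable named input_, so match every chunk explicitly.
sed -i '/unknown/d' "$folder"/*
sed -i '/tcpwrapped/d' "$folder"/*

# Empty the requested output file (not a hard-coded output.txt).
: > "$output"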

for file in "$folder"/*
do
	python nmaptocsv.py -i "$file" -f fqdn-port-protocol-service-version -d , -s -n 2>/dev/null >> "$output" || echo "error on $file"
done

# Drop empty lines and double quotes from the merged CSV.
sed -i '/^$/d' "$output"
sed -i 's/"//g' "$output"
echo "result in $output"
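
The filtering and splitting steps above could also be done in a single awk pass. What follows is only a minimal sketch, assuming the input is the plain-text .nmap format in which every host block starts with "Nmap scan report for". The function name `split_nmap_by_host` and the `host_NNNNNN.nmap` chunk names are hypothetical, not part of nmaptocsv. Each chunk is closed before the next one is opened, so awk never holds more than one output file open, and lines that appear before the first host are skipped.

```shell
#!/bin/bash
# Hypothetical helper: split a .nmap text report into one chunk per host,
# dropping "unknown" and "tcpwrapped" lines in the same pass.
# usage: split_nmap_by_host <input file> <output directory>
split_nmap_by_host() {
    local input=$1 outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /Nmap scan report for/ {
            if (out != "") close(out)      # keep only one chunk open at a time
            out = sprintf("%s/host_%06d.nmap", dir, ++n)
        }
        /unknown|tcpwrapped/ { next }      # same filter as the two sed calls
        out != "" { print > out }          # lines before the first host are skipped
    ' "$input"
}
```

The resulting chunks can then be fed to nmaptocsv one by one with the same `for` loop as in the script above.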

Apr 08 '20 15:04 sbnsec

Thank you @sbnsec, interesting. Were you trying to parse a big gnmap file or a big nmap file? How large were these files (30k+ entries)?

Apr 08 '20 17:04 maaaaz

Thanks @maaaaz for your reply! It's an nmap file, and its size is 745K.

$ cat sample.nmap | head
Nmap scan report for XXXXXXX (X.X.X.X)
Host is up (0.18s latency).
Not shown: 23463 filtered ports
PORT      STATE SERVICE             VERSION
2/tcp     open  compressnet?
3/tcp     open  compressnet?
5/tcp     open  rje?
6/tcp     open  unknown
7/tcp     open  echo?
21/tcp    open  ftp                 Pure-FTPd
$ cat sample.nmap | wc -l
31472
$ python nmaptocsv.py -i sample.nmap -f fqdn-port-protocol-service-version -d , -s -n

When I time it, the process gets killed after about three minutes: 2.25s user 24.65s system 18% cpu 2:57.72 total

Apr 09 '20 03:04 sbnsec

@sbnsec, by any chance, would you mind sending me your file (with all due anonymization, of course)? Or do you know where I could find similar files? I looked for some, without success.

I need some big-file examples to fix this long-requested feature!

Of course, I could fake a large file by copying the same results many times, but I don't think that would be a good parsing test, since real results bring unexpected subtleties.
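
For a rough stress test of memory and run time only, a synthetic file can be generated along these lines. This is a sketch under stated assumptions: the function name `make_fake_nmap`, the `hostN.example` names and the 10.x.y.z addresses are all made up, and the layout simply imitates the sample posted earlier in this thread. It will not reproduce the subtleties of real scan output.

```shell
#!/bin/bash
# Hypothetical generator: write a synthetic .nmap-style report with
# <hosts> hosts and <ports> open ports per host to stdout.
# The layout imitates the sample in this thread; it is not real scan output.
make_fake_nmap() {
    local hosts=$1 ports=$2
    awk -v hosts="$hosts" -v ports="$ports" 'BEGIN {
        for (h = 1; h <= hosts; h++) {
            # Derive a distinct private address from the host index.
            ip = sprintf("10.%d.%d.%d", int(h / 65536) % 256, int(h / 256) % 256, h % 256)
            printf "Nmap scan report for host%d.example (%s)\n", h, ip
            print  "Host is up (0.18s latency)."
            print  "PORT      STATE SERVICE             VERSION"
            for (p = 1; p <= ports; p++)
                printf "%-9s open  %-19s Pure-FTPd\n", p "/tcp", "ftp"
            print ""
        }
    }'
}
```

For example, `make_fake_nmap 1000 30 > fake.nmap` writes 1000 hosts with 30 ports each, about 34,000 lines, which is in the same range as the file that fails here.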

Apr 12 '20 22:04 maaaaz
