

Generally speaking, the easiest way to parse HTML and XML in Ruby is the Nokogiri library.
  • To install Nokogiri
    gem install nokogiri


Here we'll use Nokogiri to list the entries of our table of contents.

Using CSS selectors

require 'nokogiri'
require 'open-uri'
# On Ruby 3.0+ open-uri no longer patches Kernel#open, so call URI.open
page = Nokogiri::HTML(URI.open(""))
page.css(".book .book-summary ul.summary li a, .book .book-summary ul.summary li span").each { |css| puts css.text.strip.squeeze.gsub("\n", '') }
Returns
Module 0x0 | Introduction
0.1. Contribution
0.2. Beginners
0.3. Required Gems
1. Module 0x1 | Basic Ruby Kung Fu
1.1. String
1.1.1. Conversion
1.1.2. Extraction
1.2. Array
2. Module 0x2 | System Kung Fu
2.1. Command Execution
2.2. File manipulation
2.2.1. Parsing HTML, XML, JSON
2.3. Cryptography
2.4. Remote Shell
2.4.1. Ncat.rb
2.5. VirusTotal
3. Module 0x3 | Network Kung Fu
3.1. Ruby Socket
3.2. FTP
3.3. SSH
3.4. Email
3.4.1. SMTP Enumeration
3.5. Network Scanning


There are two ways we'd like to show here: the REXML library that ships with Ruby and the external Nokogiri library.
We have the following XML file
<?xml version="1.0"?>
<collection shelf="New Arrivals">
  <movie title="Enemy Behind">
    <type>War, Thriller</type>
    <description>Talk about a US-Japan war</description>
  </movie>
  <movie title="Transformers">
    <type>Anime, Science Fiction</type>
    <description>A scientific fiction</description>
  </movie>
  <movie title="Trigun">
    <type>Anime, Action</type>
    <description>Vash the Stampede!</description>
  </movie>
  <movie title="Ishtar">
    <description>Viewable boredom</description>
  </movie>
</collection>


require 'rexml/document'
include REXML
file = "file.xml"
xmldoc = Document.new(File.read(file))
# Get the root element
root = xmldoc.root
puts "Root element : " + root.attributes["shelf"]
# List of movie titles
xmldoc.elements.each("collection/movie") do |e|
  puts "Movie Title : " + e.attributes["title"]
end
# List of movie types
xmldoc.elements.each("collection/movie/type") do |e|
  puts "Movie Type : " + e.text
end
# List of movie descriptions
xmldoc.elements.each("collection/movie/description") do |e|
  puts "Movie Description : " + e.text
end
# List of movie stars (prints nothing when no <stars> element exists)
xmldoc.elements.each("collection/movie/stars") do |e|
  puts "Movie Stars : " + e.text
end
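The REXML loops above print each field separately. As a minimal sketch, assuming an inline, shortened copy of the sample XML (the `xml_sample` and `movies` names are illustrative, not part of the original), the following gathers each movie into one hash and tolerates a missing `<type>` element:

```ruby
require 'rexml/document'

# Inline, shortened copy of the sample data (illustrative variable name),
# so the example runs without file.xml on disk.
xml_sample = <<~XML
  <?xml version="1.0"?>
  <collection shelf="New Arrivals">
    <movie title="Enemy Behind">
      <type>War, Thriller</type>
      <description>Talk about a US-Japan war</description>
    </movie>
    <movie title="Ishtar">
      <description>Viewable boredom</description>
    </movie>
  </collection>
XML

doc = REXML::Document.new(xml_sample)

# Build one hash per <movie>. elements[] returns nil when a child is
# absent, so the safe-navigation operator (&.) avoids a NoMethodError.
movies = doc.get_elements("collection/movie").map do |movie|
  {
    title:       movie.attributes["title"],
    type:        movie.elements["type"]&.text,
    description: movie.elements["description"]&.text
  }
end

movies.each do |m|
  puts "#{m[:title]} (#{m[:type] || 'no type listed'}): #{m[:description]}"
end
```

Keeping one hash per movie makes it easier to filter or export the records later than printing each field in its own loop.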




require 'nokogiri'
# Parse the XML file contents as XML
doc = Nokogiri::XML(File.read("file.xml"))
puts doc.search("type").map { |t| t.text }          # List of Types
puts doc.search("format").map { |f| f.text }        # List of Formats
puts doc.search("year").map { |y| y.text }          # List of Years
puts doc.search("rating").map { |r| r.text }        # List of Ratings
puts doc.search("stars").map { |s| s.text }         # List of Stars
puts doc.search("description").map { |d| d.text }   # List of Descriptions


Assume you have a small vulnerability database in a JSON file, as follows
{
  "Vulnerability": [{
    "name": "SQLi",
    "details": {
      "full_name": "SQL injection",
      "description": "An injection attack wherein an attacker can execute malicious SQL statements",
      "references": ["", ""],
      "type": "web"
    }
  }]
}
To parse it
require 'json'
vuln_json = JSON.parse(File.read('vulnerabilities.json'))
Returns a hash
{"Vulnerability"=>
[{"name"=>"SQLi",
"details"=>
{"full_name"=>"SQL injection",
"description"=>"An injection attack wherein an attacker can execute malicious SQL statements",
"references"=>["", ""],
"type"=>"web"}}]}
Now you can retrieve any data as you would from a hash
vuln_json["Vulnerability"].each {|vuln| puts vuln['name']}
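Nested fields can be reached the same way. A small sketch, assuming each entry nests its metadata under a `details` key (the inline `raw` string is a trimmed stand-in for the file, and `summaries` is an illustrative name):

```ruby
require 'json'

# Inline stand-in for vulnerabilities.json (assumed layout, trimmed fields).
raw = <<~JSON
  {"Vulnerability": [
    {"name": "SQLi",
     "details": {"full_name": "SQL injection", "type": "web", "references": []}}
  ]}
JSON

vuln_json = JSON.parse(raw)

# Hash#dig walks nested keys and returns nil instead of raising
# when an intermediate key is missing.
summaries = vuln_json["Vulnerability"].map do |vuln|
  full_name = vuln.dig("details", "full_name") || "unknown"
  type      = vuln.dig("details", "type") || "unknown"
  "#{vuln['name']}: #{full_name} [#{type}]"
end

puts summaries
```

Using `dig` with a fallback keeps the loop from failing on entries that lack a `details` section.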
If you want to add to this database, just create a hash with the same structure.
xss = {"name"=>"XSS", "details"=>{"full_name"=>"Cross Site Scripting", "description"=>"A type of computer security vulnerability typically found in web applications", "references"=>["", ""], "type"=>"web"}}
You can convert it to JSON with the `.to_json` method
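As a rough sketch of adding an entry and serializing the result, assuming the database is a hash with a top-level `Vulnerability` array (the `db` variable and the trimmed entries are illustrative, not from the original file):

```ruby
require 'json'

# Illustrative in-memory database; in practice this would come from
# JSON.parse(File.read('vulnerabilities.json')).
db = { "Vulnerability" => [{ "name" => "SQLi", "details" => { "type" => "web" } }] }

xss = { "name" => "XSS", "details" => { "full_name" => "Cross Site Scripting", "type" => "web" } }

# Append the new entry, then serialize.
db["Vulnerability"] << xss
compact = db.to_json                # single-line JSON string
pretty  = JSON.pretty_generate(db)  # indented, easier to read in a file

puts pretty

# Parsing the string back yields an equal hash, confirming the round trip.
round_trip = JSON.parse(compact)
puts round_trip == db
```

Writing `pretty` back with `File.write('vulnerabilities.json', pretty)` would persist the change.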