Welcome to CPS 491 - TEAM 7
Efficient File Content Searching at Scale - GE Aviation

Nolan Hollingsworth

Computer Information Systems

Roberto Valadez

Computer Science

Andrew Streng

Computer Science

Dr. Phu Phung

Professor

Jeff Archer

GE Sponsor

About Us

For this project, our team is partnering with GE Aviation to create a tool that takes binary code sequences, searches malware samples for them, and returns ASCII/Unicode matches along with lists of the matched file paths and/or matched sequences. The goal is to give GE Aviation an efficient way to search its malware samples for a specific piece of content.

Technology

The main technologies we focus on are:

  • Linux environment
  • Python
  • Database of Malware
  • N-gram
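As an illustration of how N-grams can speed up searching, here is a minimal Python sketch of a byte-level N-gram pre-filter. It is not the project's actual implementation: the function names `byte_ngrams` and `may_contain` and the default window size of 3 are assumptions made for this example. The idea is that a file can only contain a pattern if it contains every N-gram of that pattern, so files that fail this check can be skipped without a full scan.

```python
from __future__ import annotations


def byte_ngrams(data: bytes, n: int = 3) -> set[bytes]:
    """Return the set of all n-byte windows in data (hypothetical helper)."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def may_contain(file_grams: set[bytes], pattern: bytes, n: int = 3) -> bool:
    """Cheap pre-filter: False means the pattern is definitely absent.

    True only means the file is worth scanning in full, because the
    N-grams could all be present without being adjacent.
    """
    if len(pattern) < n:
        return True  # too short to filter on; fall back to a full scan
    return byte_ngrams(pattern, n) <= file_grams


# Usage: index a sample once, then test many patterns against the index.
grams = byte_ngrams(b"MZ\x90\x00\x03")
print(may_contain(grams, b"Z\x90\x00\x03"))
print(may_contain(grams, b"\x00\x03\x04"))
```

A larger window size makes the filter more selective but increases the size of the index, so the right value depends on the sample set.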

Project Scope

The Efficient File Search project searches a file directory of malware samples, of varying file types, for occurrences that are similar to the user input. The user input can be a binary code sequence, hexadecimal bytes, or a string. Once a similar occurrence is found, the tool returns the file path, the MD5 hash of the file, and the virtual address of the result. The project was completed in April 2021 by Andrew Streng, Roberto Valadez, and Nolan Hollingsworth, with the assistance and oversight of Jeff Archer at GE Aviation, as a senior capstone at the University of Dayton. The group delivered a finished product, written in Python, that is compatible with Linux and assists GE Aviation in searching through its malware samples.
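The search flow described above can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the delivered tool: the names `to_bytes` and `search_directory` are hypothetical, matching is exact rather than similarity-based, and the sketch reports the raw file offset of each match instead of a virtual address (mapping an offset to a virtual address would require parsing the executable's section headers).

```python
import hashlib
import os


def to_bytes(query: str, kind: str) -> bytes:
    """Convert user input to raw bytes; kind is 'binary', 'hex', or 'string'."""
    if kind == "binary":
        bits = "".join(query.split())  # allow spaces between bytes
        if len(bits) % 8:
            raise ValueError("binary input must be a whole number of bytes")
        return int(bits, 2).to_bytes(len(bits) // 8, "big")
    if kind == "hex":
        return bytes.fromhex(query)  # accepts "4d5a" or "4d 5a"
    if kind == "string":
        return query.encode("utf-8")
    raise ValueError(f"unknown input kind: {kind}")


def search_directory(root: str, pattern: bytes):
    """Yield (path, md5_hex, file_offset) for every exact match under root."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()  # whole file in memory; acceptable for a sketch
            offset = data.find(pattern)
            if offset == -1:
                continue
            md5 = hashlib.md5(data).hexdigest()  # hash only files that match
            while offset != -1:
                yield path, md5, offset
                offset = data.find(pattern, offset + 1)
```

A call might look like `for path, md5, offset in search_directory("samples/", to_bytes("4d 5a", "hex")): print(path, md5, hex(offset))`, where `samples/` is a placeholder directory name.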

Demo Video