GraphSPD
GraphSPD represents patches as graphs with richer semantics and utilizes a patch-tailored graph model for detection.
With the increasing popularity of open-source software, embedded vulnerabilities have been widely propagating to downstream software. Due to different maintenance policies, software vendors may silently release security patches without providing sufficient advisories (e.g., CVE). This leaves users unaware of security patches and provides attackers good chances to exploit unpatched vulnerabilities. Thus, detecting those silent security patches becomes imperative for secure software maintenance.
We design a graph neural network based security patch detection system named GraphSPD, which represents patches as graphs with richer semantics and utilizes a patch-tailored graph model for detection.
More details about GraphSPD can be found in the paper “GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics”, appeared in the 44th IEEE Symposium on Security and Privacy (IEEE S&P 2023), San Francisco, CA, May 22-26, 2023.
Download
You can download the source code of GraphSPD via GitHub: https://github.com/SunLab-GMU/GraphSPD.
If you are using GraphSPD for work that will result in a publication (thesis, dissertation, paper, article), please use the following citation.
@inproceedings{wang2022graphspd,
title = {GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics},
author = {Shu Wang and Xinda Wang and Kun Sun and Sushil Jajodia and Haining Wang and Qi Li},
booktitle = {2023 IEEE Symposium on Security and Privacy (SP)},
year = {2023},
pages = {2409-2426},
doi = {10.1109/SP46215.2023.00035},
url = {https://doi.ieeecomputersociety.org/10.1109/SP46215.2023.00035},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = {May}
}
or
Shu Wang, Xinda Wang, Kun Sun, Sushil Jajodia, Haining Wang, and Qi Li, "GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics," 2023 44th IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, US, 2023 pp. 2409-2426. doi: 10.1109/SP46215.2023.00035
Disclaimer
We developed GraphSPD and released the source code, aiming to help admins and developers identify silent security patches and prioritize their development. We hold no liability for any undesirable consequences of using the software package.
License
Copyright 2022 SunLab-GMU
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
The joern
library is based on the GitHub repository Jeorn, which is also under the Apache License 2.0 (see ./jeorn/LICENSE
for more information).
Installation
1. Install OS
Download Ubuntu 20.04.4 LTS (Focal Fossa) desktop version .iso
file and install the OS.
Note: To avoid potential version conflicts of Python and Java on existing OS, we suggest installing a new Ubuntu virtual machine on VMM (e.g., VirtualBox 5.2.24, VMware, etc.).
Suggested system configurations:
(Our package can run on pure CPU environments)
RAM: >2GB
Disk: >30GB
CPU: >1 core
2. Clone Source Code
Install git
tool if you have not intalled it.
sudo apt install git
Download the project folder into the user’s HOME
directory.
cd ~
git clone https://github.com/SunLab-GMU/GraphSPD.git
3. Install Dependencies
cd ~/GraphSPD/
chmod +x install_dep.sh
./install_dep.sh
How-to-Run
Note: All commands are executed under the main project folder: ~/GraphSPD/
.
1. Pre-processing
1.1 Use Test Patch Samples
We provide 10 patch as test examples in ~/GraphSPD/raw_patch/
.
To retrive their pre- and post-patch files, run the following commands under ~/GraphSPD/
:
python3 get_ab_file.py nginx nginx 02cca547
python3 get_ab_file.py nginx nginx 661e4086
python3 get_ab_file.py nginx nginx 9a3ec202
python3 get_ab_file.py nginx nginx dac90a4b
python3 get_ab_file.py nginx nginx fc785b12
python3 get_ab_file.py nginx nginx 60a8ed26
python3 get_ab_file.py nginx nginx bd7dad5b
python3 get_ab_file.py nginx nginx 4c5a49ce
python3 get_ab_file.py nginx nginx 71eb19da
python3 get_ab_file.py nginx nginx 56f53316
In the above, the third column refers to the owner (i.e., nginx
), the fourth column refers to the repository (i.e., nginx
), and the last column refers to the commit ID (e.g., 02cca547
).
As the result, the pre-patch and post-patch files will be stored in ~/GraphSPD/ab_file/
.
1.2 Use Your Own Patches
You can pre-process your own patches by running:
python3 get_ab_file.py [owner] [repository] [commitID]
where [owner]
, [repository]
, and [commitID]
are the owner name, repository name, and commit ID of the patch hosted on GitHub.
2. Generate PatchCPGs
To generate PatchCPGs for all the patches processed by the last step, please run the following commands under ~/GraphSPD/
:
chmod -R +x ./joern
sudo python3 gen_cpg.py
python3 merge_cpg.py
Here, gen_cpg.py
will generate two CPGs for pre- and post-patch files, respectively.
merge_cpg.py
will generate a merged PatchCPG from the two CPGs.
The output PatchCPGs will be saved in ~/GraphSPD/testdata/
.
3. Run PatchGNN
In ~/GraphSPD/
, run the command:
python3 test.py
The prediction results are saved in file ~/GraphSPD/logs/test_results.txt
.
See the results by running:
cat logs/test_results.txt
The prediction results contains the PatchCPG file path and the predictions, where 1 represents security patch and 0 represents non-security patch.
filename,prediction
./testdata/fc785b12/out_slim_ninf_noast_n1_w.log,0
./testdata/dac90a4b/out_slim_ninf_noast_n1_w.log,1
./testdata/60a8ed26/out_slim_ninf_noast_n1_w.log,1
./testdata/71eb19da/out_slim_ninf_noast_n1_w.log,0
./testdata/9a3ec202/out_slim_ninf_noast_n1_w.log,1
./testdata/bd7dad5b/out_slim_ninf_noast_n1_w.log,1
./testdata/661e4086/out_slim_ninf_noast_n1_w.log,1
./testdata/02cca547/out_slim_ninf_noast_n1_w.log,0
./testdata/4c5a49ce/out_slim_ninf_noast_n1_w.log,0
./testdata/56f53316/out_slim_ninf_noast_n1_w.log,0
Other Resources
[1] PatchDB: A Large-Scale Security Patch Dataset
[2] PatchRNN: A Deep Learning-Based System for Security Patch Identification
[3] Detecting “0-Day” Vulnerability: An Empirical Study of Secret Security Patch in OSS
Team
The GraphSPD repo is built by Sun Security Laboratory (SunLab) at George Mason University, Fairfax, VA.