Building a CLI Tool: How I Developed a Download Link Grabber for Terabox.com
This is a CLI application that grabs a direct download link from terabox.com without leaving the terminal.
Terabox is a cloud storage website where users can upload files and share them globally. Recently I had to download a file from Terabox, but when I visited the website I found intrusive video ads, and users who are not logged in cannot view or download files. Mobile users also have to install the Terabox app, which is very buggy and loaded with ads. So I decided to look for a way to download files without visiting the website or the app, and that is how this project started.
The first step is to intercept the traffic between the client and the server. I like working with Burp Suite, but this can also be done with Chrome developer tools. I started the Burp proxy listener, visited the URL, and got the following result.
Initially, I looked into the HTML file, but found nothing special. Then I looked into the other intercepted URLs and still found nothing exceptional.
So there was nothing more to do without logging in. I created an account and logged in.
I found something interesting. Let me show you the differences. In the non-logged-in state, the client-to-server communication is limited. These are the last requests exchanged between the client and the server:
After logging in, I got some new client-server entries.
So I looked into the endpoint POST /share/download and got something interesting.
{
"errno": 0,
"request_id": 556713879707242100,
"dlink": "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&dstime=1691756190&expires=1691756790&rt=sh&chkv=1&sign=FDtERVA-DCb740ccc5511e5e8fedcff06b081203-Lzf3upC%2F4sQ2B5%2BPAiALaPTdvIA%3D&r=445224630&sharesign=xAZZo/oyUpc9ZHyBQ+fg7nblMg+sZf0jzU2BzbFnAq+K1V5K7feNNpJSedsTcL4HG63Ny1W599Btewmst2cpDfeuNG+uqOdPzspCUxkt9/wO5WmSW1ZUwYSZ+qZk4xIXfkLOR2WzFOONf59SPn3ie5xwajpXstpaIQDH9QyQQe6MLHYR3QKy0yF6c7zlKz1fYFMOdPJjIUcsENawI9k2Xkhrc6hzoYnUlSTqyGlGMDguu8VsuLKCWdrooz+AoQOH1cOyH1Wi1koKFDMY+XlU1kOpxpjD8MyAukajN5lAh5k=&sh=1"
}
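The dlink carries its own timing information in the query string. Here is a small sketch for reading it, assuming that dstime and expires are Unix timestamps in seconds (I have not confirmed this with Terabox). The helper name link_lifetime is made up for illustration, and the sample URL is a shortened copy of the dlink above that keeps only the two parameters the helper reads.

```python
from urllib.parse import urlsplit, parse_qs


def link_lifetime(dlink: str) -> int:
    """Return expires - dstime in seconds.

    Assumption: both query parameters are Unix timestamps in seconds.
    """
    query = parse_qs(urlsplit(dlink).query)
    return int(query["expires"][0]) - int(query["dstime"][0])


# Shortened copy of the dlink above: only the two timing parameters are kept.
sample = (
    "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431"
    "?dstime=1691756190&expires=1691756790"
)
print(link_lifetime(sample))  # 600
```

If the assumption holds, this particular link is valid for 600 seconds, so it should be used soon after it is generated.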
Now that I had the download link, I needed to backtrack through the process to see how the URL is produced. So I looked at the Burp Suite history just before this request. After reading the request and response, I didn't find anything interesting.
So I continued backtracking and looked at the previous HTTP exchange. Again, nothing interesting, so I checked the one before that.
After some more tries, I found something interesting: a response whose JSON values are used later in the download request. I had saved that information in a text editor earlier so that I could backtrack efficiently.
{
"errno": 0,
"request_id": 556702044726912800,
"server_time": 1691756145,
"cfrom_id": 0,
"title": "/DownloadFile.txt",
"list": [
{
"category": "4",
"fs_id": "308208376422326",
"isdir": "0",
"local_ctime": "1691734185",
"local_mtime": "1691734174",
"md5": "a061925c4c32eea46eef95ebfc01d431",
"path": "/DownloadFile.txt",
"play_forbid": "0",
"server_ctime": "1691734185",
"server_filename": "DownloadFile.txt",
"server_mtime": "1691734185",
"size": "171",
"dlink": "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&dstime=1691756146&rt=sh&sign=FDtAER-DCb740ccc5511e5e8fedcff06b081203-2xdzc5NrPy1OIz5sFwNl7Bj20Jw%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=556702044726912757&dp-callid=0&r=269311579&sh=1",
"thumbs": {
"url1": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691755200&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-GVhswkPNYGlgmi2K%2BiZMK95Y08E%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=556702044726912757&dp-callid=0&size=c140_u90&quality=100&vuk=-&ft=video",
"url3": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691755200&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-GVhswkPNYGlgmi2K%2BiZMK95Y08E%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=556702044726912757&dp-callid=0&size=c850_u580&quality=100&vuk=-&ft=video"
},
"docpreview": "https://data.terabox.com/doc/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691756146&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-RVSo71qijs5EnMqvMzvWND0sQsY%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=556702044726912757&dp-callid=0",
"emd5": "4d11abc32pca5fb3fd229156e7445804"
}
],
"share_id": 2612301411,
"uk": 4400038120818
}
I had found the download link again. That means we don't need the POST /share/download endpoint at all. I started backtracking again and found something more, although none of it is information that gets passed to the GET /share/list endpoint that responds with the download link.
After backtracking through the HTTP history, I reached the initial GET /sharing/link endpoint, where the HTML is loaded. So we have narrowed things down to two endpoints:
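Since the GET /share/list response already contains the dlink, pulling it out is a matter of walking the "list" array. The sketch below shows one way to do that. The function name extract_dlinks is hypothetical, treating a non-zero errno as a failure is my assumption, and the sample payload is a trimmed copy of the response above with the dlink query string removed.

```python
import json
from typing import Dict, Optional


def extract_dlinks(payload: dict) -> Dict[str, Optional[str]]:
    """Map each server_filename to its dlink, or to None if dlink is absent."""
    # Assumption: errno == 0 means success, anything else is an error.
    if payload.get("errno") != 0:
        raise ValueError(f"server reported errno={payload.get('errno')}")
    return {
        item["server_filename"]: item.get("dlink")
        for item in payload.get("list", [])
    }


# Trimmed copy of the response above (dlink query string removed).
raw = """
{
  "errno": 0,
  "list": [
    {
      "server_filename": "DownloadFile.txt",
      "size": "171",
      "dlink": "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431"
    }
  ]
}
"""
links = extract_dlinks(json.loads(raw))
print(links["DownloadFile.txt"])
```

Using item.get("dlink") instead of item["dlink"] matters here, because the server can return the same entry without a dlink key.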
GET /sharing/link
GET /share/list
Because the first endpoint returns an HTML document, let's look into it later. First we need to know which cookie or query values must be sent for the server to include the download URL in its response. So I first tried sending the request without the original request headers and cookies. The server did not respond with the download URL, so I concluded that the headers or the cookies are important.
{
"errno": 0,
"request_id": 557953564544771840,
"server_time": 1691760808,
"cfrom_id": 0,
"title": "/DownloadFile.txt",
"list": [
{
"category": "4",
"fs_id": "308208376422326",
"isdir": "0",
"local_ctime": "1691734185",
"local_mtime": "1691734174",
"md5": "a061925c4c32eea46eef95ebfc01d431",
"path": "/DownloadFile.txt",
"play_forbid": "0",
"server_ctime": "1691734185",
"server_filename": "DownloadFile.txt",
"server_mtime": "1691734185",
"size": "171",
"thumbs": {
"url1": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691758800&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-gz4xuW%2F1eJBMUoSSYzC9gHCaQHc%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=557953564544771859&dp-callid=0&size=c140_u90&quality=100&vuk=-&ft=video",
"url3": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691758800&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-gz4xuW%2F1eJBMUoSSYzC9gHCaQHc%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=557953564544771859&dp-callid=0&size=c850_u580&quality=100&vuk=-&ft=video"
},
"docpreview": "https://data.terabox.com/doc/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691760808&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-BzVM%2FN7opjlUJJcNgbFzERMgusM%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=557953564544771859&dp-callid=0",
"emd5": "4d11abc32pca5fb3fd229156e7445804"
}
],
"share_id": 2612301411,
"uk": 4400038120818
}
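A quick way to see what changed between this response and the logged-in one is to compare the keys of the two "list" entries. This is only a sketch: the helper name fields_only_in is made up, and the two dictionaries are trimmed copies of the entries shown in this post (the dlink query string is removed).

```python
def fields_only_in(first: dict, second: dict) -> list:
    """Return the keys present in `first` but missing from `second`, sorted."""
    return sorted(set(first) - set(second))


# Trimmed copies of the two "list" entries shown in this post.
logged_in_entry = {
    "fs_id": "308208376422326",
    "server_filename": "DownloadFile.txt",
    "size": "171",
    # Query string trimmed for readability.
    "dlink": "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431",
    "emd5": "4d11abc32pca5fb3fd229156e7445804",
}
no_cookie_entry = {
    "fs_id": "308208376422326",
    "server_filename": "DownloadFile.txt",
    "size": "171",
    "emd5": "4d11abc32pca5fb3fd229156e7445804",
}

print(fields_only_in(logged_in_entry, no_cookie_entry))  # ['dlink']
```

The only field that disappears is dlink; everything else about the file is still returned.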
Now let's add only the cookies, without any other header info, to the request and look at the response. This time the server sends the download link. So I think the cookie alone is enough in the request headers dictionary. Still, to stay on the safe side, it is better to have the Python program simulate the same kind of request a browser makes. For that, we can add some key-value pairs like:
Accept: application/json, text/plain, */*
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
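In Python, these headers can be attached to a request object before anything is sent. The sketch below uses the standard library's urllib.request so it runs without extra packages; the function name build_request and the cookie_header parameter are my own naming for illustration, and the trailing */* in the Accept value is an assumption. Nothing is sent over the network here.

```python
from urllib.request import Request

# Browser-like headers taken from the list above.
BROWSER_HEADERS = {
    "Accept": "application/json, text/plain, */*",  # trailing */* is an assumption
    "Content-Type": "application/x-www-form-urlencoded",
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36"
    ),
}


def build_request(url: str, cookie_header: str) -> Request:
    """Build (but do not send) a GET request with browser-like headers.

    `cookie_header` is the raw Cookie string copied from the browser.
    """
    headers = dict(BROWSER_HEADERS)
    headers["Cookie"] = cookie_header
    return Request(url, headers=headers, method="GET")


# "name=value" is a placeholder, not a real cookie.
req = build_request("https://www.terabox.com/share/list", "name=value")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the actual request.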
{
"errno": 0,
"request_id": 558103019102725250,
"server_time": 1691761364,
"cfrom_id": 0,
"title": "/DownloadFile.txt",
"list": [
{
"category": "4",
"fs_id": "308208376422326",
"isdir": "0",
"local_ctime": "1691734185",
"local_mtime": "1691734174",
"md5": "a061925c4c32eea46eef95ebfc01d431",
"path": "/DownloadFile.txt",
"play_forbid": "0",
"server_ctime": "1691734185",
"server_filename": "DownloadFile.txt",
"server_mtime": "1691734185",
"size": "171",
"dlink": "https://d.terabox.com/file/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&dstime=1691761365&rt=sh&sign=FDtAER-DCb740ccc5511e5e8fedcff06b081203-SrV0hpuTXh2PRVgxP%2BwGsnK8YCk%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=558103019102725276&dp-callid=0&r=491067794&sh=1",
"thumbs": {
"url1": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691758800&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-gz4xuW%2F1eJBMUoSSYzC9gHCaQHc%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=558103019102725276&dp-callid=0&size=c140_u90&quality=100&vuk=-&ft=video",
"url3": "https://data.terabox.com/thumbnail/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691758800&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-gz4xuW%2F1eJBMUoSSYzC9gHCaQHc%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=558103019102725276&dp-callid=0&size=c850_u580&quality=100&vuk=-&ft=video"
},
"docpreview": "https://data.terabox.com/doc/a061925c4c32eea46eef95ebfc01d431?fid=4400038120818-250528-308208376422326&time=1691761365&rt=sh&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-r8ghExHro0cbBBhoH3KHcuzRNuo%3D&expires=8h&chkv=0&chkbd=0&chkpc=&dp-logid=558103019102725276&dp-callid=0",
"emd5": "4d11abc32pca5fb3fd229156e7445804"
}
],
"share_id": 2612301411,
"uk": 4400038120818
}
Not every key-value pair in the cookie is important. So I needed to find out which pairs matter, so that the program will be easy to maintain once I publish it. After some testing, I found that only the ndus=************************************ value is important: if this value alone is present in the cookie, the server responds with the download URL. The good thing is that all the values I used for this file work with every file, and the cookie value does not expire quickly. I think it will take more than 30 days for the cookie to expire.
So the project was almost complete. But there was a PROBLEM: later I found that the jsToken query value changes over time, and an old jsToken value doesn't work with the cookie. To be specific, only the last few characters of the jsToken change, so it is probably derived from Unix epoch time. Now I needed to find where the server initially sends the jsToken, so it was time to backtrack again. As I had already narrowed down the endpoints to check, I found the source quite easily: it is in the response of GET /sharing/link, where the HTML is initially loaded. I searched for jsToken in the Burp Suite response section and got something good.
I used the BeautifulSoup library to get the script tags, and from those tags I extracted and decoded the jsToken value. This script is only generated when the ndus cookie is present, which means the user must be logged in.
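The testing I describe here is a leave-one-out check: drop one cookie at a time and see whether the download link still comes back. Below is a minimal sketch of that idea. The names required_cookies and fake_probe are hypothetical, the cookie names other than ndus are invented placeholders, and fake_probe stands in for a real request to the server.

```python
from typing import Callable, Dict, List


def required_cookies(
    cookies: Dict[str, str],
    works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the cookie names whose removal makes `works` return False."""
    needed = []
    for name in cookies:
        # Send everything except this one cookie.
        trimmed = {k: v for k, v in cookies.items() if k != name}
        if not works(trimmed):
            needed.append(name)
    return needed


def fake_probe(cookies: Dict[str, str]) -> bool:
    """Stand-in for a real request: pretend the server only checks ndus."""
    return "ndus" in cookies


# Placeholder values; "lang" and "csrfToken" are invented cookie names.
browser_cookies = {"lang": "en", "ndus": "PLACEHOLDER", "csrfToken": "PLACEHOLDER"}
print(required_cookies(browser_cookies, fake_probe))  # ['ndus']
```

In a real run, works would send the request with the trimmed cookies and return True only when the response contains a dlink.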
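To give an idea of what this extraction can look like, here is a rough sketch. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The sample page and the pattern it searches for (a URL-encoded script that calls fn("...") to set window.jsToken) are assumptions made for illustration; the real markup may differ, and the class and function names are my own.

```python
import re
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import unquote


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.scripts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


# Assumed shape of the decoded script: ...window.jsToken = a};fn("TOKEN")
TOKEN_RE = re.compile(r'fn\("([^"]+)"\)')


def extract_js_token(html: str) -> Optional[str]:
    """Return the jsToken found in the page, or None if no script has it."""
    collector = ScriptCollector()
    collector.feed(html)
    for body in collector.scripts:
        decoded = unquote(body)  # the script body is assumed to be URL-encoded
        if "jsToken" not in decoded:
            continue
        match = TOKEN_RE.search(decoded)
        if match:
            return match.group(1)
    return None


# Hypothetical page; ABC123 is a placeholder token.
sample_page = (
    "<html><script>var x = 1;</script>"
    "<script>try{eval(decodeURIComponent(`function%20fn%28a%29%7B"
    "window.jsToken%20%3D%20a%7D%3Bfn%28%22ABC123%22%29`))}catch(e){}</script>"
    "</html>"
)
print(extract_js_token(sample_page))  # ABC123
```

Returning None when no script matches makes it easy to tell the user that the ndus cookie is missing or no longer valid.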
So, to summarize briefly, the script should work like this:
First, I need to get my ndus cookie value. To do that, I install the Edit This Cookie extension, log in to Terabox, and find the ndus cookie value.
Next, I request the initial file URL "https://www.terabox.com/sharing/link?surl=Cd4dLu8wvq7uBTD0izTqsA" with the ndus cookie value in the request headers.
Then I extract the jsToken value from the specific script tag, which is easily done with the BeautifulSoup library.
The last step is to pass the jsToken to the GET /share/list endpoint with the ndus cookie in the request headers, and the server responds with the direct download URL.
Some examples of the CLI app are shown below.
Thank you for reading this post. If you want the script, please let me know by email or on my LinkedIn profile.
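The steps above can be sketched as one function. This is not the actual script, only an outline under several assumptions: the HTTP call is injected as a fetch(url, headers) callable so the flow can be shown without touching the network, the token pattern fn("...") is assumed, and the /share/list request is shown with jsToken as its only query parameter even though the real endpoint may expect more. The names grab_dlink and fake_fetch and the placeholder URL are invented.

```python
import json
import re
from typing import Callable, Dict, List, Optional
from urllib.parse import unquote, urlencode

BASE = "https://www.terabox.com"
Fetch = Callable[[str, Dict[str, str]], str]  # (url, headers) -> response body


def grab_dlink(surl: str, ndus: str, fetch: Fetch) -> Optional[str]:
    """Follow the steps above and return the first dlink, or None."""
    headers = {"Cookie": f"ndus={ndus}"}

    # Request the share page with the ndus cookie.
    page = fetch(f"{BASE}/sharing/link?{urlencode({'surl': surl})}", headers)

    # Pull the jsToken out of the page (assumed pattern).
    match = re.search(r'fn\("([^"]+)"\)', unquote(page))
    if match is None:
        return None

    # Call /share/list with the token and the same cookie.
    # Assumption: any other query parameters the endpoint needs are omitted.
    query = urlencode({"jsToken": match.group(1)})
    payload = json.loads(fetch(f"{BASE}/share/list?{query}", headers))
    for item in payload.get("list", []):
        if item.get("dlink"):
            return item["dlink"]
    return None


# Fake transport for demonstration; it records the URLs it was asked for.
requested: List[str] = []


def fake_fetch(url: str, headers: Dict[str, str]) -> str:
    requested.append(url)
    if "/sharing/link" in url:
        return "<script>fn%28%22TOKEN123%22%29</script>"
    # Placeholder dlink, not a real file.
    return json.dumps({"errno": 0, "list": [{"dlink": "https://d.terabox.com/file/example"}]})


result = grab_dlink("Cd4dLu8wvq7uBTD0izTqsA", "PLACEHOLDER", fake_fetch)
print(result)
```

Injecting fetch keeps the flow testable; in the real tool it would be replaced by a function that performs the HTTP request and returns the response text.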
Email: deepsarkerofficial@gmail.com
Written by
Deep Sarker