My first article was about downloading movies from a page with a bit of parsing, all done in bash. This time the task is more complicated: we need to download the movies from a playlist that is loaded via AJAX.

The first task is to find out how the list of episodes gets onto the page. I checked the AJAX request in the browser console and found the PLAYLIST_ID in the page source code.

<div class="playlists-ajaxed" data-playlist_id="666"></div>

So we should download the page and extract only the ID from it.

get_list_id() {
    # $1: URL of the page that embeds the playlist
    echo "$1" |
    wget -O- -i- --no-verbose |
    hxnormalize -x |
    sed -n 's/.*data-playlist_id="\([^"]\+\).*/\1/p'
}
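
The extraction step can be tried offline, without wget or hxnormalize. This is only a sketch: the function name extract_playlist_id is made up for the illustration, and it assumes the attribute is double-quoted and sits on a single line, as in the snippet from the page source above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HTML on stdin, print the value of data-playlist_id.
# Assumes a double-quoted attribute that does not span several lines.
extract_playlist_id() {
    sed -n 's/.*data-playlist_id="\([^"]*\)".*/\1/p'
}

# Sample markup copied from the page source shown in the article.
sample='<div class="playlists-ajaxed" data-playlist_id="666"></div>'
printf '%s\n' "$sample" | extract_playlist_id
```

With the sample div this prints 666. The pattern uses [^"]* instead of the \+ form, so it does not depend on GNU sed.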

I was really excited by how nicely this little function works!
Next we make a request to a custom URL and get JSON back.

get_json_list() {
    # $1: playlist ID
    echo "https://somewebsite.com/ajax/playlists.php?playlist_id=$1" |
    wget -O- -i- --no-verbose |
    jq -r .response | # get the value of the "response" key
    hxnormalize -x | # normalize the HTML
    hxselect -i "li[data-id=\"0_0\"]" | # select videos from the first player only
    sed 's/data-file/href/g' | # replacement to make hxwls work
    sed 's/<li /<a /g' | # replacement to make hxwls work
    hxwls
}

Interestingly, the JSON response has a "response" key whose value contains HTML tags! Ah, the good old jQuery days...
To extract that value from the JSON I use jq here, a lightweight and flexible command-line JSON processor: https://stedolan.github.io/jq/
There are TWO playlists in the HTML, so I select the li elements of the first one only. Luckily they all share the same data-id attribute: hxselect -i "li[data-id=\"0_0\"]".
To make the last command, hxwls, work (it extracts the links from HTML), I simply rename the data-file attribute of each li element to href and replace "<li" with "<a". Works perfectly.
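
The two sed replacements can be checked on a single sample element. This is a rough sketch, not the real pipeline: the sample li and its URL are invented, and list_hrefs is a simplified stand-in for hxwls that only handles one double-quoted href per line.

```shell
#!/usr/bin/env bash
# Turn <li data-file="..."> into <a href="..."> so a link lister can read it.
# Same two substitutions as in get_json_list, merged into one sed call.
li_to_anchor() {
    sed -e 's/data-file/href/g' -e 's/<li /<a /g'
}

# Simplified stand-in for hxwls (assumption: one double-quoted href per line).
list_hrefs() {
    sed -n 's/.*href="\([^"]*\)".*/\1/p'
}

# Invented sample element; the real markup comes from the "response" key.
sample='<li data-id="0_0" data-file="https://example.com/embed/1">Episode 1</li>'
printf '%s\n' "$sample" | li_to_anchor | list_hrefs
```

For the sample this prints https://example.com/embed/1. Note that the closing </li> is left untouched; the sketch only needs the opening tag to look like a link.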
As a result I have a variable with a list of URLs, which can be processed like in the previous script.

IFRAMES_LIST=$(get_json_list "$PLAYLIST_ID")
if [ -z "$IFRAMES_LIST" ]; then
    echo "No iframes found. exit"
    exit 1
fi

# $IFRAMES_LIST is left unquoted on purpose: one URL per iteration
for iframe in $IFRAMES_LIST;
do
    VIDEO_URI=$(get_video_uri "$iframe")
    FILENAME=$(get_filename_from_url "$VIDEO_URI")
    PLAYLIST="$VIDEO_URI$QUALITY/index.m3u8"
    ffmpeg -i "$PLAYLIST" -c copy -bsf:a aac_adtstoasc "$FILENAME.mp4" -y
done
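
Before starting the real downloads it can help to do a dry run that only prints the playlist URLs. The sketch below is an assumption-heavy illustration: print_playlists is a made-up name, the cdn.example.com URIs are invented, and it assumes one video URI per line, each ending with a slash, with the quality given as a directory name such as 720.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: read video URIs on stdin (one per line) and
# print the m3u8 playlist URL that would be handed to ffmpeg.
# $1: quality directory, e.g. 720 (assumed naming scheme)
print_playlists() {
    local quality=$1 uri
    while IFS= read -r uri; do
        [ -n "$uri" ] || continue   # skip blank lines
        printf '%s%s/index.m3u8\n' "$uri" "$quality"
    done
}

# Invented sample URIs; real ones would come from get_video_uri.
printf '%s\n' 'https://cdn.example.com/v/1/' 'https://cdn.example.com/v/2/' |
    print_playlists 720
```

Reading the list line by line with while read avoids the word splitting and glob expansion that an unquoted for loop relies on.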
