WAYBACK SCRAPER FOR RDNA
Tutorial
Most people didn't get all the product pages, if any, when downloading their
files with the RDNA Downloader, as RDNA removed the pages progressively
during the transfer of products to DAZ.
It appears, though, that the Wayback Machine ( http://archive.org/web/ ) has
stored, if not all, then a lot of those missing pages (thanks to Haslor for
pointing that out).
You’ll need the original store links to the RDNA product pages to look up
the pages on Wayback, however, so the Create Product List feature was
updated in version 0.537 of the RDNA Downloader to also include these
links.
It’s troublesome to look up all the pages and save the content manually,
so the Wayback Scraper for RDNA has been developed to semi-automate the
process, based on the product list generated by the downloader.
Before you start, make sure that your product list is up to date so it
includes the links. If it was generated by version 0.537 or later of the
downloader, and doesn’t include Gift pages, it should be OK. If it includes
Gift pages, however, the links will be mixed up because of a bug in version
0.536 and earlier. In this case you must create a new list from scratch using
the latest version, 0.540, where this bug is fixed. Alternatively, you can edit
and update the product list; see RDNA Downloader Tutorial under Create
Product List update 2. Sorry for the inconvenience…
If you’re not sure which version you created the product list with, open the
product list file and check whether each line includes a URL. This file can be
found in the program folder of the RDNA Downloader under the name
“RDNA_product_list_with_product_page_urls.txt”.
Using the Scraper
1. The first thing to do is run the program as administrator, then select “Set
Browser Compatibility” from the Settings menu to make the program
compatible with your system. Then close and restart the program. You only
have to do this the first time you run the program. Do not change the
program file name, as this will break the compatibility.
2. On the File Paths tab, select the file containing the product list. It’s
located in the program folder of the RDNA Downloader.
3. On the Browser tab, click the Load Product List button. The Exclude
Existing checkbox will exclude all products for which you’ve already
downloaded the product pages with the downloader, so normally you’ll just
keep that checked.
4. Click the Load Wayback Page button. After some seconds (Wayback is
generally slow) you should see a copy of the RDNA Store main page
that was saved on Wayback. Do not click on the product list until the page is
fully loaded.
5. Click on a product name on the list in the left pane. After some seconds
the selected product page should show up in the browser. Check that the
product name on the product page matches the one you selected from the
list before saving the data.
6. Click the Save Product Page button. This will save a copy of the page
source with the product data as well as copies of all the promo pictures. If
you see the pictures show up in the right pane and the product name is
marked green in the left pane, everything should have been saved correctly.
If, instead of the product page, you get a Wayback page with a message
offering to “Save this url in the Wayback Machine”, click the Back button to
return to the previous page and continue with the next product on the list
(points 5 and 6 above). Do not click “Save this url in the Wayback Machine”,
as this will just save the now disabled (empty) page for that product, which
may confuse others trying to access the page.
Notes:
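If you'd rather not check the product list for links by eye, a small script
can do it. The following is only a rough sketch in Python, not part of either
program. It assumes the list is a plain text file with one product per line
and that the links start with http:// or https://. The function names are
made up for illustration.

```python
import re
from pathlib import Path

# Assumption: store links in the product list start with http:// or https://
URL_PATTERN = re.compile(r"https?://\S+")


def lines_missing_url(lines):
    """Return (line_number, text) pairs for non-empty lines that contain no URL."""
    missing = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if text and not URL_PATTERN.search(text):
            missing.append((number, text))
    return missing


def check_product_list(path):
    """Read the product list file and report the lines that have no URL."""
    # The file encoding is not documented, so undecodable bytes are replaced.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return lines_missing_url(text.splitlines())
```

Calling check_product_list with the path to
RDNA_product_list_with_product_page_urls.txt returns an empty list if every
line has a link. Anything else points to the lines that need a closer look.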
If a product page looks messed up, saving the product data will usually work
anyway. If not, you can try to find another version of the page, if one
exists. Links to other versions can be found at the top of the page, below the
address field.
There will usually be some extra pictures included, like thumbnails and
pictures belonging to other products. Some of these will look like they’re
broken (an X instead of the picture). In fact they’re not; they’re PNG files
with a wrong JPG extension, which apparently confuses the picture browser.
There seem to be some of these misnamed files on practically every page.
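If you want to find the misnamed files yourself, they can be recognised by
their first bytes: every PNG file starts with the same fixed 8-byte signature,
whatever its extension says. The following is only a sketch in Python, not
something the scraper does. The function names are made up for illustration,
and the folder you pass in is up to you.

```python
from pathlib import Path

# Every PNG file begins with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data: bytes) -> bool:
    """True if the given leading bytes are the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def find_misnamed_pngs(folder):
    """Return .jpg/.jpeg files under folder whose content is really PNG."""
    misnamed = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            with path.open("rb") as handle:
                header = handle.read(len(PNG_SIGNATURE))
            if is_png(header):
                misnamed.append(path)
    return misnamed


def rename_to_png(paths):
    """Rename each file to a .png extension, skipping names already taken."""
    renamed = []
    for path in paths:
        target = path.with_suffix(".png")
        if not target.exists():
            path.rename(target)
            renamed.append(target)
    return renamed
```

Renaming is optional, and it's safer to try it on a copy of the folder first,
since the saved page source may still refer to the original file names.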
The product page data will be saved in folders named after each product in
the “ProductData” folder located in the program folder. The data needs to be
processed into the same format as the pages downloaded with the
downloader. I’ll add some code later that can do that, as well as remove
thumbnails and other irrelevant stuff. The important thing is to get the raw
data and pictures before they become unavailable at the end of January
2017.
The picture list which will show in the right pane when you save data for a
product will be cleared as soon as you click on a product in the left pane.
You actually don’t have to wait for a page to finish loading, pictures and
all, before you can save the data. Try clicking the Save Product Page button
as soon as you can see the RDNA menus and the product name on the store page.
If you see the pictures show up in the right pane, everything should have
been saved and you can continue with the next product immediately. If you
don’t see any pictures, just click again after a few seconds, until they show
up. I believe there should be at least one picture saved for each product
page; at least I haven’t come across a page yet without any.
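After a longer session it can be handy to double-check that no product folder
ended up without pictures. The following is only a sketch in Python, not part
of the program. It assumes each product has its own folder directly under
“ProductData” and that the pictures have one of the extensions listed. The
function name and the extension list are assumptions for illustration.

```python
from pathlib import Path

# Assumption: saved promo pictures use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def folders_without_pictures(product_data_dir):
    """Return names of product folders that contain no picture files at all."""
    empty = []
    for folder in sorted(Path(product_data_dir).iterdir()):
        if not folder.is_dir():
            continue
        # Look through the folder and any subfolders for at least one picture.
        has_picture = any(
            item.is_file() and item.suffix.lower() in IMAGE_SUFFIXES
            for item in folder.rglob("*")
        )
        if not has_picture:
            empty.append(folder.name)
    return empty
```

Any product names it returns are candidates for loading and saving again in
the scraper.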
The Product List is locked during part of the process. If for some reason it
remains locked and you can’t access it, just click Enable Product List in the
top menu.
The program will check for updates when it starts. You can also check
manually under Help > Check for Updates.
Updated for version 0.501
PROMOPIC GRABBER FOR RDNA
This program can download the promo pictures for every product in the
whole store, or, if you have a product list generated by the RDNA
Downloader, it can be set to only download those for the products you
have.
What it unfortunately can't do is get the product page data; the only known
way to get that is by using the Wayback Scraper.
GENERAL INSTRUCTIONS:
Just click the Start button. If you click Cancel and leave the program open,
you can continue from where you left off by checking Resume and then clicking
Start again.
If you want to cancel and close the program and continue later, first write
down the Product ID number. When you start the program again, enter that
number in the First box to continue from there. You can also check how far
you've come in the picture_log.txt file (if you download everything) or the
picture_log_user.txt file (if you only download files for owned products).
These files are located in the same folder as the program.
When Range 1 is finished, just select Range 2 and continue.
Specifically for downloading pictures for your own products only:
Check the Get pictures for owned products only checkbox. A dialog will
open; browse to the program folder for the RDNA Downloader and select
the RDNA_product_list_with_product_page_urls.txt file. The dialog will
only open the first time you check the box. The file will be loaded
automatically if you cancel and close the program, and start over again
later.
Check that the checkbox is still checked, and click Start.
The pictures will be saved in the pictures folder (if you download all) or
the pictures_user folder (if you only download pictures for owned products),
both located in the same folder as the program.