I am using SmartFTP Professional.
We are working with a vendor who collects digital newspaper files from around the state. Newspaper companies/staff upload new files (mostly PDFs and PNG previews) via FTP to a server, and we then connect to the server and download the new files. We have a four-hour daily window for downloads. I set up Scheduler to add the root directory to the queue with a one-way transfer and with "Use cached local folder tree" enabled, and set up Timer to process the queue during our time window. Everything so far is working as intended.
The problem I am running into is that once the queue is being processed, SmartFTP checks every single file on the remote server to determine whether it should be downloaded. This is quite time-consuming because there are about 1.2 million files in 30,000 directories on the remote server (the tree is root > newspaper > date > files). The result is that the queue doesn't process fast enough for us to keep up with new files, which are added on a daily and/or weekly basis. It takes several weeks for SmartFTP to run through the entire tree once, and in that time we fall farther behind because new files land in directories that have already been processed. Then the whole thing starts over when SmartFTP processes the root again (Scheduler adds it to the queue every two weeks, though one pass takes longer than that).
I have identified 90 active parent directories--that is, 90 newspapers who are regular contributors and whose child directories will always have new files. I've considered setting up Scheduler for each of these parent directories, to avoid the time spent processing the inactive directories, and then scheduling the root once every two months or so to make sure we're not missing anything. I'd like to avoid this if possible, though. First, the active directories are only going to keep getting larger, so I'm really just remedying the issue short term; second, it will still take weeks to process these active directories because they are already so large.
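If SmartFTP can't limit the comparison natively, I could imagine scripting something like the following with Python's ftplib: since the tree is newspaper > date, only date folders on or after a cutoff need checking at all. This is just a sketch of that idea; the host, credentials, newspaper name, and the YYYY-MM-DD folder naming are my assumptions, not necessarily how the vendor's server is actually laid out.

```python
from datetime import date, datetime
from ftplib import FTP

def recent_date_dirs(names, cutoff):
    """Keep only entries that parse as YYYY-MM-DD dates on/after cutoff."""
    recent = []
    for name in names:
        try:
            d = datetime.strptime(name, "%Y-%m-%d").date()
        except ValueError:
            continue  # skip entries that aren't date-named folders
        if d >= cutoff:
            recent.append(name)
    return sorted(recent)

def list_recent(ftp, papers, cutoff):
    """For each active newspaper, list only the date folders worth checking."""
    targets = []
    for paper in papers:
        entries = ftp.nlst(f"/{paper}")
        # some servers return full paths from NLST, so strip to the leaf name
        leaves = [e.rsplit("/", 1)[-1] for e in entries]
        for day in recent_date_dirs(leaves, cutoff):
            targets.append(f"/{paper}/{day}")  # queue this folder for download
    return targets

# Usage sketch (placeholder host/credentials/newspaper):
# ftp = FTP("ftp.example.com")
# ftp.login("user", "password")
# for folder in list_recent(ftp, ["daily-gazette"], date(2024, 1, 1)):
#     print(folder)
```

That would skip the 30,000-directory walk entirely and touch only folders that can contain new files, at the cost of trusting the date naming; the occasional full root pass would still catch anything mis-filed.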
We are downloading these files for archival purposes, so it is not necessary that our copy be perfectly synced within 24 hours. However, we went from two weeks behind when we started (the vendor gave us a complete copy of the files to start from) to two months behind, so we are losing ground on staying up to date. If I could keep the remote and local copies within a week or two of each other, or at least stop losing ground, I would be very happy.