Workflow Guide
Workflow Details
- Prerequisites: An understanding of LucidLink Filespaces and Windows Operating System Command Prompt
- Time required: 30 minutes
Introduction
Maintaining a copy of your critical data in a secure location is a key aspect of a robust disaster recovery strategy. While LucidLink Filespaces offer excellent redundancy and availability, backing up your data to an additional storage system ensures that you have multiple layers of protection.
Robocopy is a command-line utility included in Windows that provides advanced file copy operations. This guide explains a specific Robocopy command designed to copy files efficiently while preserving attributes and timestamps and providing detailed feedback on the process. The command also uses multithreading to improve copy speed.
The architecture requires the following components:
- LucidLink Filespace
- Windows Operating System with internet connection and the LucidLink Client installed
- Network Attached Storage (NAS) device connected to the Windows OS
With a LucidLink Filespace as the source and a NAS as the destination, the Robocopy command copies data from the Filespace to the NAS to support a disaster recovery use case.
The Command
Here's the Robocopy command we'll be discussing:
robocopy <source> <destination> /E /COPY:DAT /DCOPY:T /XO /R:1 /W:1 /V /MT /Z /LOG:<logfile>
What This Command Does
- Recursively copies all directories and subdirectories, including empty ones, from Source to Destination.
- Copies data, attributes, and timestamps of each file.
- Copies directory timestamps.
- Excludes older files, so existing destination files are not overwritten with older versions. Files that already exist on the destination and match on timestamp and size are skipped rather than recopied.
- Retries failed copies once with a 1-second wait between retries.
- Provides verbose output, giving detailed information about the copy process.
- Uses multithreading with the default of 8 threads, which can significantly speed up the operation.
- Uses “restartable” mode to resume interrupted copy operations.
- Logs the output of the operation to a specified log file.
Explanation of Each Component
robocopy: Initiates the Robocopy utility.
<source>: The source directory where the files are copied from.
<destination>: The destination directory where the files are copied to.
Switches and Options
Switch | Effect
/E | Copies all subdirectories, including empty ones, ensuring the entire directory tree is copied.
/COPY:DAT | Specifies what to copy for each file: D (data), A (attributes), and T (timestamps).
/DCOPY:T | Copies directory timestamps. By default, Robocopy does not copy directory timestamps.
/XO | Excludes older files. A source file that is older than the existing copy on the destination is skipped, preventing existing files from being overwritten with older versions.
/R:1 | Specifies the number of retries on failed copies. Set to 1, meaning Robocopy will retry once if a file fails to copy.
/W:1 | Specifies the wait time between retries, in seconds. Set to 1, meaning Robocopy will wait 1 second before retrying a failed copy.
/V | Enables verbose output, providing detailed information during the copy process.
/MT | Enables multithreaded copying, defaulting to 8 threads. This can significantly speed up the copy process, especially for large numbers of small files. To specify a different number of threads, use /MT:n where n is the number of threads (e.g., /MT:32).
/Z | Enables “restartable” mode, allowing the copy operation to resume from where it left off if interrupted.
/LOG:<logfile> | Outputs the results to a specified log file. Replace <logfile> with the actual path and name of the log file you want to use (e.g., C:\Logs\RobocopyLog.txt).
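If you prefer to drive the backup from a script, the command above can be assembled as an argument list. The following is a minimal Python sketch, not an official LucidLink tool: the function names and the example paths (L:\Projects, \\nas\backup\Projects) are hypothetical, and it assumes robocopy is available on the PATH of a Windows machine. It also relies on the documented Robocopy behavior that exit codes form a bitmask in which values of 8 or higher indicate at least one failed copy.

```python
import subprocess


def build_robocopy_command(source, destination, logfile, threads=None):
    """Return the Robocopy command from this guide as an argument list."""
    # /MT alone uses the default of 8 threads; /MT:n sets an explicit count.
    mt_switch = "/MT" if threads is None else f"/MT:{threads}"
    return [
        "robocopy", source, destination,
        "/E", "/COPY:DAT", "/DCOPY:T", "/XO",
        "/R:1", "/W:1", "/V", mt_switch, "/Z",
        f"/LOG:{logfile}",
    ]


def run_backup(source, destination, logfile, threads=None):
    """Run Robocopy and return True when no copy failure was reported."""
    command = build_robocopy_command(source, destination, logfile, threads)
    result = subprocess.run(command)
    # Exit codes below 8 mean no failures (for example 0 = nothing to copy,
    # 1 = files copied); 8 or higher means some copies failed.
    return result.returncode < 8


if __name__ == "__main__":
    # Hypothetical paths: replace with your Filespace mount and NAS share.
    ok = run_backup(r"L:\Projects", r"\\nas\backup\Projects",
                    r"C:\Logs\RobocopyLog.txt")
    print("Backup completed without failures" if ok else "Backup reported failures")
```

Passing the switches as a list avoids manual quoting of paths that contain spaces.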
Robocopy Workflow Considerations
- Create a LucidLink Filespace user with read-only permissions to the root of the Filespace. This ensures that the source data is not modified by the process.
- No, Robocopy checks files based on time/date and size and skips files that match, so you can use the same command for the initial copy.
- If the Robocopy script runs after the source has been modified by ransomware, the modified files will be copied to the destination. If you suspect ransomware has modified your source data, do not run the Robocopy script, and cancel any scheduled runs until you have recovered your source data. Note that if the source is LucidLink, LucidLink snapshots can help you recover from ransomware attacks.
- A partially uploaded file will be copied, but the remaining data will be copied in the next backup run.
- It will be copied as a new file or folder to the destination. The original folder will remain unchanged on the destination.
- If the file existed on the source during the previous backup, it will not be deleted from the destination. If the file was created and deleted between backups, it will not exist on the destination.
- No, backup data should only be accessed if the primary storage is unavailable. Backup data should be tested during allocated testing times without modifying it during backups.
- Yes, to access data in a LucidLink Filespace, a LucidLink client must be installed.
- It depends on your PC's capabilities. The default is 8 threads, but you can specify more during testing to optimize performance.
- Mirroring is possible using /MIR, but it is not recommended for Disaster Recovery (DR) workflows because it deletes files from the backup if they are deleted from the source. Use /MIR cautiously, and consider running it with /L /X first as a dry run (/L lists what would happen without copying or deleting anything).
- Check the Robocopy log file and compare the folder properties (both are described in the next section).
- No, Robocopy on Windows can copy data that meets the requirements of Windows supported characters.
- Yes, see the documentation for rsync or rclone on macOS or Linux.
- No changes will be made to the destination or source data. The log file will say:
  2024/06/04 10:04:02 ERROR 2 (0x00000002) Accessing Source Directory <source> The system cannot find the file specified.
- Egress (data download) charges are included for LucidLink Advanced and Basic Filespaces. However, a Custom Filespace may generate data transfer charges from your chosen storage provider (e.g., AWS).
- Yes. Running the LucidLink daemon as a service means it survives operating system reboots. See the knowledge base article for more information.
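Error entries such as the "ERROR 2" line quoted above can be picked out of a log automatically. The sketch below is illustrative only: the regular expression is modeled on that single example line, the function name is hypothetical, and other Robocopy versions or locales may format errors differently.

```python
import re

# Modeled on the example line:
# 2024/06/04 10:04:02 ERROR 2 (0x00000002) Accessing Source Directory <source> ...
ERROR_LINE = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"ERROR (?P<code>\d+) \(0x[0-9A-Fa-f]{8}\) (?P<detail>.*)$"
)


def find_log_errors(log_text):
    """Return (timestamp, error code, detail) for each ERROR line in the log."""
    errors = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            errors.append(
                (match.group("timestamp"), int(match.group("code")), match.group("detail"))
            )
    return errors
```

To use it, read the log file written by /LOG into a string (for example with open(path, errors="replace").read()) and pass the text to find_log_errors. An empty list means no lines matched this pattern, which is not by itself proof that the run succeeded.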
Checking the backup process status
Robocopy Log
In a Robocopy log, the summary table provides a breakdown of the files and directories processed during the copy operation. Each column in this summary has a specific meaning.
Example Summary Table
              Total     Copied    Skipped   Mismatch     FAILED     Extras
Dirs :       267718     267718     267717          0          0         53
Files :       12388         10      12378          0          0          0
Bytes :   985.343 g    5.799 g  979.544 g          0          0          0
Times :     2:03:44    0:30:46                           0:00:00    0:26:35
Speed :     3372453 Bytes/sec.
Speed :     192.973 MegaBytes/min.
Ended :     07 June 2024 04:57:24
- Total: The total number of files or directories that were considered during the copy operation. This includes files and directories that were copied, skipped, mismatched, failed, and extra.
- Copied: The number of files or directories that were successfully copied from the source to the destination.
- Skipped: The number of files or directories that were not copied, for example because they already exist in the destination and are identical to those in the source (based on criteria such as size and timestamp), or because they were excluded by a switch such as /XO.
- Mismatch: The number of items whose type differs between source and destination, for example a file on the source that corresponds to a directory of the same name on the destination. Robocopy does not copy these items.
- FAILED: The number of files or directories that Robocopy attempted to copy but failed. This could be due to various reasons such as permissions issues, network errors, or other problems.
- Extras: The number of files or directories that exist in the destination but not in the source. These are considered "extra" files or directories and can indicate files that were deleted from the source but are still present in the destination.
The summary provides a quick overview of the results of the Robocopy operation, allowing you to identify any issues or discrepancies that occurred during the process.
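For scheduled runs, the Dirs and Files rows of the summary can be read programmatically so that a non-zero FAILED count raises an alert. The following Python sketch assumes the summary layout shown in the example above; the function name is hypothetical, and rows that do not match the expected six-integer layout (such as Bytes, Times, and Speed) are ignored.

```python
import re

COLUMNS = ("Total", "Copied", "Skipped", "Mismatch", "FAILED", "Extras")

# Matches rows such as "Files : 12388 10 12378 0 0 0".
SUMMARY_ROW = re.compile(
    r"^\s*(Dirs|Files)\s*:\s*" + r"\s+".join([r"(\d+)"] * 6) + r"\s*$"
)


def parse_summary(log_text):
    """Return {'Dirs': {...}, 'Files': {...}} with one integer per column."""
    summary = {}
    for line in log_text.splitlines():
        match = SUMMARY_ROW.match(line)
        if match:
            counts = [int(value) for value in match.groups()[1:]]
            summary[match.group(1)] = dict(zip(COLUMNS, counts))
    return summary
```

A simple check after each run is to confirm that both the "Dirs" and "Files" entries are present and that their "FAILED" values are 0.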
Comparing Folder Properties
You can right-click a folder and select Properties to display the folder properties.
- x Files, x Folders: The number of files and subfolders contained in the folder, counted across all nested subfolders.
- Size: The total amount of data contained within all the files in the folder, displayed in units such as bytes, kilobytes, megabytes, or gigabytes.
- Size on Disk: The actual amount of disk space used by the folder, which can be larger than the total file size due to storage allocation in disk clusters.
- Attributes: Information about the folder's properties and access permissions, such as Read-only (files can be read but not modified) and Hidden (folder is not visible in normal view).
- Created Date: The timestamp indicating when the folder was originally created.
After the first backup process runs, comparing the source and backup properties can help you check if the backup was successful.
- x Files, x Folders: The number of files and subfolders should match (excluding any Robocopy log files you may have written to the destination folder).
- Size: Should be the same.
Comparing "Size" and the file/folder count is a quick and easy way to verify that a backup has captured all files and folders from the original location. "Size" confirms that the total data volume matches, while the file/folder count (shown as "Contains" in the Properties dialog) confirms that the number of files and subfolders is identical. This approach can quickly highlight major discrepancies, such as missing files or folders, or significantly different data sizes.
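The same file count, folder count, and size comparison can be scripted instead of read from the Properties dialog. This is a minimal Python sketch with a hypothetical function name; it walks the whole tree, so on a large Filespace it may take a while to complete.

```python
import os


def tree_stats(root):
    """Return (file count, folder count, total size in bytes) under root."""
    file_count = folder_count = total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folder_count += len(dirnames)  # subfolders found at this level
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, folder_count, total_bytes
```

Calling tree_stats on the source folder and on the backup folder and comparing the two tuples gives the same quick check as comparing "Contains" and "Size" by hand. Remember that any log files written into the destination folder will be counted as well.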
After additional backups, the properties might be different if any of the following happens:
- A file is renamed on the source. This triggers a copy of the file under its new name to the destination (without deleting the originally named file from the backup).
- A file is deleted from the source (the file will not be deleted from the backup).
Some other properties may also differ:
- Size on Disk: This value can differ slightly due to differences in disk storage allocation between the source and destination drives. Different file system structures or cluster sizes can cause this discrepancy.
- Attributes: The Read-only and Hidden attributes may be intentionally different if you changed them for the backup process or if the backup tool alters them. This typically doesn't affect the data integrity.
- Created Date: The creation date of the backup folder will generally differ because the backup folder is created when you perform the backup, not when the original folder was created. Subfolder timestamps should match if you use the /DCOPY:T switch.
For more reliable verification, additional methods can be employed:
- Checksum/Hash Comparisons: Generate and compare checksums (like MD5, SHA-256) for each file in both the original and backup locations. Identical checksums ensure that the files are exactly the same byte-for-byte.
- File Comparison Tools: Use specialized software (e.g., WinMerge, Beyond Compare) to compare files and folders in detail. These tools can compare not only file content but also timestamps and attributes.
- Automated Scripts: Use scripts or programs to automate the comparison process, ensuring consistency and thoroughness in large data sets.
These methods provide a higher degree of certainty that the backup is an exact replica of the original, ensuring data integrity and reliability in the event of a restore.
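As an illustration of the checksum approach, the following Python sketch compares SHA-256 hashes of every file under a source folder with the corresponding file in the backup. The function names are hypothetical, and this is one possible approach rather than a LucidLink-provided tool. Note that hashing reads every file in full, so running it against a Filespace downloads all of the data being compared.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source, backup):
    """Return (missing, different): relative paths absent from or unequal in backup."""
    source, backup = Path(source), Path(backup)
    missing, different = [], []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = backup / relative
        if not dst_file.is_file():
            missing.append(str(relative))
        elif sha256_of(src_file) != sha256_of(dst_file):
            different.append(str(relative))
    return missing, different
```

Two empty lists mean every source file has a byte-identical counterpart in the backup. Files that exist only in the backup (for example, files since deleted or renamed on the source) are intentionally not reported.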
Automating the robocopy process
Automating a batch script in Windows can be done using Task Scheduler, which allows you to schedule tasks to run at specific times or intervals. Follow our Automating a robocopy backup process on Windows KB article for details.
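As a rough illustration only (the KB article remains the reference procedure), a daily task can also be registered from a script using the Windows schtasks command-line tool. In this Python sketch the task name, script path, and start time are hypothetical placeholders, and the function merely wraps schtasks /Create with its /TN, /TR, /SC, and /ST options.

```python
import subprocess


def build_schtasks_command(task_name, script_path, start_time="02:00"):
    """Return a schtasks command that runs script_path daily at start_time (HH:mm)."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # name shown in Task Scheduler
        "/TR", script_path,  # program or batch script to run
        "/SC", "DAILY",      # schedule type
        "/ST", start_time,   # start time in 24-hour HH:mm format
    ]


if __name__ == "__main__":
    # Hypothetical task name and script location: adjust before use.
    command = build_schtasks_command("LucidLinkBackup", r"C:\Scripts\backup.bat")
    subprocess.run(command, check=True)
```

Creating the task may require an elevated prompt, depending on the account and options you choose.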