Filespace Data Migration

It is often necessary to migrate data between Filespaces, whether across object storage providers or storage regions, or when hosting separate Filespaces for production and archive purposes. You might also require data recovery between snapshots and the live Filespace.

In this article we provide guidance on achieving these scenarios, with methods dependent on your operating system environment. In each case you will run multiple Filespace daemons, attaching an individual mount-point for each daemon and Filespace instance.

Filespaces present their data as an extension of the local operating environment. This guide is designed around our Filespace mount-points, but it applies equally to any traditional data copy between two local drives or folders. At the end of the day, think of it as any other bulk file system transfer.

Our environment assumes you have a separate data volume for your cache, and we provide guidance on configurations for both Linux and Windows setups. Choose your operating system based on your data types: Linux and macOS typically have more flexible file/folder naming conventions than Windows.

It is important to confirm that the destination has FileSystem.ForbidSpecialCharacters disabled before beginning the transfer. Your Filespace version dictates whether FileSystem.ForbidSpecialCharacters is enabled by default. See Configure Filespace Settings.
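
To verify or change the setting, you can follow the `lucid config` syntax used later in this article. The `--list` flag and `--global` scope below are assumptions, so confirm the exact syntax for your client version against Configure Filespace Settings.

lucid config --list | grep ForbidSpecialCharacters        # assumed flag; check the current value
lucid config --set --global --FileSystem.ForbidSpecialCharacters false        # assumed scope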

In this article we're utilizing an Amazon EC2 c5ad.2xlarge instance in the cloud. We recommend a machine with at least 4-8 CPUs and 16GB RAM to support the two Filespace daemon instances, along with the resources required for the copy process.

We recommend hosting the cache on the fastest possible disk technology (NVMe/SSD preferable), sized to act as a suitable buffer complementing your Internet uploads. See the Migration Strategies article for more on this topic.

Your Internet bandwidth will play a big part in the transfer time, noting that you will be making read requests to the source Filespace, traversing the operating environment's copy process and pushing data to the destination Filespace instance.

It is often suitable to leave the cache at the default 25GiB. With cloud rather than on-premises architecture we have sufficient bandwidth that data in and data out should roughly match: you should be able to evacuate your cache at close to the speed of your copy process, so queuing too much data in the cache is unnecessary.

To summarise our architecture: we have a machine in the cloud mounting our source and destination Filespaces, where we perform a native operating system copy using Robocopy or Rsync. You can quite easily replicate this environment on any equivalent cloud, virtual or physical infrastructure.

If you require assistance, please do not hesitate to reach out to support.

 

Linux Migration

The Screen terminal multiplexer enables running multiple terminal processes in the background, ensuring they continue running while you are disconnected.

It is always best to ensure you have appropriate space and a suitable underlying storage location for your cache, so we will guide you through setting up the daemons accordingly.

We will assume that a recommended NVMe/SSD disk is available to host your `root-path`. 

Should a suitable device not be available to separate your cache/metadata from the system disk, simply remove the `--root-path` option from the command-line daemon instructions.

Our example environment uses Filespace 1 as the source, migrating to Filespace 2 as the destination, with a 300GB SSD mounted at `/mnt/data` as the metadata and cache root-path.

1. Prepare Linux machine or VM and configure cache/metadata root-path data location.

    a. Install Lucid

sudo apt update
sudo wget https://www.lucidlink.com/download/latest/lin64/stable/ -O lucidinstaller.deb
sudo apt install ./lucidinstaller.deb -y
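
Optionally confirm the installation before proceeding:

dpkg -l | grep -i lucid        # confirm the package installed
which lucid                    # confirm the client binary is on the PATH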

    b. Prepare the NVMe device for cache (adjust /dev/nvme1n1 according to the lsblk output)

lsblk
sudo mkfs -t ext4 /dev/nvme1n1
sudo mkdir /mnt/data
sudo mount /dev/nvme1n1 /mnt/data
sudo chmod 777 /mnt/data

    c. Add fstab entry

sudo nano /etc/fstab

/dev/nvme1n1    /mnt/data    ext4    defaults,nofail    0    0

    d. Reboot to test fstab

sudo reboot

    e. Ensure the NVMe volume is mounted after reboot and fstab is working

ls /mnt/data -lh
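
You can also confirm the volume's capacity and mount status before sizing your caches in step 6:

df -h /mnt/data        # confirm ~300GB capacity and available space
lsblk -f               # confirm /dev/nvme1n1 is mounted at /mnt/data as ext4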

2. Share read-only credentials to the Filespace 1 source, either the root (/) share or the required shares.

3. Enable read-write credentials to the Filespace 2 destination. This could be a dedicated share for migration purposes; the client can cut/paste the data into the required locations at a later date.

Cut/paste, unlike copy/paste, is a metadata transaction and doesn't physically move data inside a Filespace (no egress out and ingest of an independent copy).
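
For example, a post-migration move within the destination Filespace is near-instant regardless of data size. The paths below are hypothetical:

mv /mnt/filespace2/migration/projectA /mnt/filespace2/production/projectA        # metadata-only move within the same Filespace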

4. Configure Filespace 1 instance for our source mount-point

screen -S Filespace1 -dm lucid --instance 1 daemon --fs filespace1.domain --user fsuser --password userpwd --mount-point /mnt/filespace1 --root-path /mnt/data/lucid

5. Configure Filespace 2 instance for our destination mount-point

screen -S Filespace2 -dm lucid --instance 2 daemon --fs filespace2.domain --user fsuser --password userpwd --mount-point /mnt/filespace2 --root-path /mnt/data/lucid
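
Before copying, confirm both instances are linked and mounted. The `status` subcommand below is an assumption, so substitute whichever command your client version provides:

lucid --instance 1 status        # assumed subcommand; confirm the source is linked
lucid --instance 2 status        # assumed subcommand; confirm the destination is linked
ls /mnt/filespace1 /mnt/filespace2        # both mount-points should list Filespace content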

6. Set the caches, leaving free space on /mnt/data for metadata. We'll allocate 2x 100GB from the 300GB NVMe available in our c5ad.2xlarge instance specification.

    a. set Filespace 1 cache 

lucid --instance 1 config --set --local --DataCache.Size 100G

    b. set Filespace 2 cache

lucid --instance 2 config --set --local --DataCache.Size 100G
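
Optionally verify the values took effect. The `--list` flag is an assumption based on the config syntax above:

lucid --instance 1 config --list --local        # assumed flag; DataCache.Size should read 100G
lucid --instance 2 config --list --local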

7. Rsync /mnt/filespace1 to /mnt/filespace2

screen -S Rsync -dm rsync -aAXvP /mnt/filespace1/ /mnt/filespace2 --log-file=/mnt/data/lucid/logfile.txt

Note: no trailing slash on the source means the directory itself, together with its contents, is copied into the destination path as a directory, whereas a trailing slash on source/ copies only the contents of the source into the destination directory.
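
For reference, the flags are -a (archive), -A (preserve ACLs), -X (preserve extended attributes), -v (verbose) and -P (progress plus resume of partial transfers). If in doubt, add -n first for a dry run that lists what would transfer without copying anything:

rsync -aAXvn /mnt/filespace1/ /mnt/filespace2        # dry run; nothing is copied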

8. Monitor Screen sessions

    a. List sessions

screen -ls

    b. Attach to session

screen -r <session>

    c. Detach from session

ctrl+a d

9. It is important to thoroughly review your logfile.txt, noting any errors or suspicious entries that require further investigation. Please cross-reference between the source and destination any files/folders that stand out within the log. Contact support with any queries.
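
A quick way to surface likely problem lines in a large log:

grep -inE 'error|failed|denied|vanished' /mnt/data/lucid/logfile.txt        # flag entries worth investigating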

10. Terminate Lucid instance daemons

lucid --instance 1 exit
lucid --instance 2 exit

Optionally, remove Bash shell history

for i in $(history | grep 'lucid' | awk '{print$1}' | sort -nr); do history -d $i;done
sed -i -e '/lucid/d' ~/.bash_history

 

Windows Migration (PowerShell)

As Windows doesn't have a terminal multiplexer like Screen, we will leverage background PowerShell processes to run our two Filespace instance daemons, source and destination.

We will utilize Robocopy in a PowerShell prompt, outputting progress to the screen and to a log file.

In our example environment we will assume you are assigning a specific disk for your cache and metadata to ensure suitable capacity and performance.

If you will be leveraging the defaults, remove and ignore the `--root-path` options.

1. Provision read-only credentials to the Filespace 1 source, either the root (/) share or the required shares.

2. Assign read-write credentials to the Filespace 2 destination. This could be a dedicated share for migration purposes; the client can cut/paste the data into the required locations at a later date.

Cut/paste, not copy/paste, is a metadata transaction and doesn't physically move data inside a Filespace.

3. Prepare the cache/metadata data location via Disk Management.

    a. Our example environment migrates Filespace 1 to Filespace 2, with a 300GB SSD at E:\ as the metadata and cache location root-path.

4. Configure Filespace 1 instance.

Start-Process -WindowStyle hidden -FilePath "C:\Program Files\Lucid\Resources\Lucid.exe" -ArgumentList "--instance 1 daemon --fs <filespace1.domain> --mount-point c:\filespace1 --root-path e:\lucid --user <fsuser> --password <userpwd>" 

5. Configure Filespace 2 instance.

Start-Process -WindowStyle hidden -FilePath "C:\Program Files\Lucid\Resources\Lucid.exe" -ArgumentList "--instance 2 daemon --fs <filespace2.domain> --mount-point c:\filespace2 --root-path e:\lucid --user <fsuser> --password <userpwd>"
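
As on Linux, confirm both instances are up before copying. The `status` subcommand below is an assumption, so substitute your client version's equivalent:

lucid --instance 1 status        # assumed subcommand; confirm the source is linked
lucid --instance 2 status        # assumed subcommand; confirm the destination is linked
Get-ChildItem c:\filespace1, c:\filespace2        # both mount-points should list Filespace content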

6. Set the caches, 2x 100GB out of 300GB, leaving free space on E:\ for metadata. We'll allocate this from the 300GB NVMe available in our c5ad.2xlarge instance specification.

    a. set Filespace 1 cache 

lucid --instance 1 config --set --local --DataCache.Size 100G

    b. set Filespace 2 cache

lucid --instance 2 config --set --local --DataCache.Size 100G

7. Robocopy process with log and output on screen.

powershell -command "robocopy c:\filespace1 c:\filespace2 /e /z /mt /r:10 /w:10 /np | tee 'e:\logfile.txt'"
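
For reference: /e copies subdirectories including empty ones, /z uses restartable mode, /mt enables multithreaded copying (8 threads by default), /r:10 and /w:10 retry failures 10 times at 10-second intervals, and /np suppresses per-file percentage output. To preview the transfer first, add the list-only switch:

robocopy c:\filespace1 c:\filespace2 /e /l        # /l lists what would be copied without copying anything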

8. It is important to thoroughly review your logfile.txt, noting any errors or suspicious entries that require further investigation. Please cross-reference between the source and destination any files/folders that stand out within the log. Contact support with any queries.
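
Robocopy prefixes failures with ERROR, so you can surface them quickly:

Select-String -Path e:\logfile.txt -Pattern 'ERROR' -Context 1        # list errors with surrounding context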

9. Terminate Lucid instance daemons

lucid --instance 1 exit
lucid --instance 2 exit

Optionally, remove PowerShell command-line shell history

clear-history -CommandLine *lucid*
$HistorySavePath = (Get-PSReadlineOption).HistorySavePath; (Get-Content "$HistorySavePath") -notmatch "lucid" | Out-File "$HistorySavePath"

 

Efficient Snapshot Recovery

A similar methodology applies when migrating data between a snapshot and the live Filespace.

You can adjust the instance daemons accordingly to attach to a snapshot.

1. Retrieve your snapshot <id>.

lucid snapshot --list

2. Once you've identified your snapshot ID, add it to your Filespace daemon instance with the following option:

--snapshot <id>

3. Adjust your daemon instances accordingly, whether running in a Screen session or as background daemon processes and/or services.

Linux:

lucid --instance 1 daemon --fs <filespace.domain> --user <fsuser> --password <userpwd> --mount-point /media/filespace
lucid --instance 2 daemon --fs <filespace.domain> --user <fsuser> --password <userpwd> --mount-point /media/snapshot --snapshot <id>

Windows:

lucid --instance 1 daemon --fs <filespace.domain> --user <fsuser> --password <userpwd> --mount-point c:\filespace
lucid --instance 2 daemon --fs <filespace.domain> --user <fsuser> --password <userpwd> --mount-point c:\snapshot --snapshot <id>

4. Transfer your data depending on your operating system and your data recovery/transfer requirements.

Linux:

rsync -aAXvP /media/snapshot/ /media/filespace --log-file=/home/<user>/logfile.txt
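
You can also recover selectively rather than the whole snapshot. The folder name below is hypothetical:

rsync -aAXvP /media/snapshot/projectA/ /media/filespace/projectA --log-file=/home/<user>/logfile.txt        # recover a single directory from the snapshot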

Windows:

robocopy c:\snapshot c:\filespace /e /z /mir

Note: this Robocopy /mir example mirrors the snapshot, replacing/purging any data in the live Filespace that is not present in the snapshot. Likewise, adding --delete or --delete-excluded to the Rsync example will purge extra files from the destination.
 
