Issue
Premise: I'm a bit of a newbie with Amazon AWS and with Linux partitioning in general.
I need to train a TensorFlow 2.0 deep learning model on a g4dn.4xlarge instance (the one with a single Nvidia T4 GPU). The setup went smoothly and the machine was correctly initialized. According to the machine's configuration, I have:
- 8GB root folder;
- 200GB of storage (that I was able to mount on startup using this guide https://devopscube.com/mount-ebs-volume-ec2-instance/#:~:text=Step%201%3A%20Head%20over%20to,text%20box%20as%20shown%20below)
And here is the result of lsblk:
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  33.3M  1 loop /snap/amazon-ssm-agent/3552
loop1         7:1    0  32.3M  1 loop /snap/snapd/11588
loop2         7:2    0  70.4M  1 loop /snap/lxd/19647
loop3         7:3    0  55.5M  1 loop /snap/core18/1997
loop4         7:4    0  55.4M  1 loop /snap/core18/2066
nvme1n1     259:0    0 209.6G  0 disk /newvolume
nvme0n1     259:1    0     8G  0 disk
└─nvme0n1p1 259:2    0     8G  0 part /
The problem: I was following this guide https://medium.com/quantrium-tech/installing-tensorflow-2-with-nvidia-gpu-on-google-cloud-instance-a8dde3746f23 to install the drivers needed to use the GPU with TensorFlow, but I ran into a "no space left on the device" error, because the required packages need more space than the 8 GB available on the root volume.
What I've Tried: I tried installing the drivers on the disk I mounted (/newvolume), but they end up on the root volume anyway (that was probably a naive attempt). I also tried merging the two disks by following a sketchy guide, but made no progress.
The Question: Is there any way to merge the two disks so that root has 200GB+ and I can install the necessary drivers without running out of space? Or are there any other workarounds?
My goal is not to expand the root volume by configuring another instance with more space, but to make use of the 200GB disk (nvme1n1).
Many thanks!
Solution
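- Expand the existing EC2 root EBS volume from 8 GB to 200 GB in the AWS EBS console. The extra space only becomes usable once the root partition and its filesystem have also been extended inside the instance. After that, you can detach and delete the EBS volume mounted on /newvolume.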
OR
- Terminate this instance and launch a new EC2 instance. While launching it, increase the size of the root volume from 8 GB to 200 GB.
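For the first option, enlarging the root volume is usually followed by extending the partition and the filesystem from inside the instance. Below is a minimal sketch of that sequence, not a definitive procedure. It assumes the device names from the lsblk output in the question (/dev/nvme0n1, partition 1), an ext4 root filesystem, and that growpart and the AWS CLI are installed. VOLUME_ID is a hypothetical placeholder, and the run helper is only an illustration that prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only. Device names come from the lsblk output in the question;
# VOLUME_ID is a hypothetical placeholder for the real root volume ID.
VOLUME_ID="vol-0123456789abcdef0"
DISK="/dev/nvme0n1"
PART_NUM="1"
PART="${DISK}p${PART_NUM}"

# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# 1. Enlarge the EBS volume to 200 GB (same effect as editing it in the console).
run aws ec2 modify-volume --volume-id "$VOLUME_ID" --size 200

# 2. Extend partition 1 so that it fills the enlarged disk.
run sudo growpart "$DISK" "$PART_NUM"

# 3. Grow the filesystem to fill the partition.
#    resize2fs assumes ext4; an XFS root would need xfs_growfs instead.
run sudo resize2fs "$PART"

# 4. Check the size of the root filesystem.
run df -h /
```

Run as-is, the script only prints the four commands so they can be reviewed first. Running it with RUN=1 executes them; the filesystem type can be checked beforehand with df -hT /.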
Answered By - Jyothish Kumar S