


Link to original content: http://phabricator.wikimedia.org/T376892
T376892: Expand media backup storage available space to 960 TB per datacenter

Closed, Resolved (Public)

Description

We recently ran into low-space issues. Expand the storage dedicated to media backups to 960 TB (6 hosts) in both eqiad and codfw.


Current utilization (as of 2024-10-10) is 645 708 607 840 309 bytes.
Current utilization (as of 2024-11-21) is 656 178 463 691 108 bytes.
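
For readers who want the raw byte counts in the same units as the 960 TB target, here is a minimal sketch of the conversion. It assumes the target uses decimal terabytes (10**12 bytes) and that the two figures are directly comparable to that target; the task states neither, and the function names are illustrative only.

```python
# Utilization figures quoted in the task description, in bytes.
UTILIZATION = {
    "2024-10-10": 645_708_607_840_309,
    "2024-11-21": 656_178_463_691_108,
}

# Assumption: the 960 TB target is expressed in decimal terabytes.
TARGET_BYTES = 960 * 10**12


def to_tb(n_bytes: int) -> float:
    """Convert a byte count to decimal terabytes."""
    return n_bytes / 10**12


def percent_of_target(n_bytes: int) -> float:
    """Express a byte count as a percentage of the 960 TB target."""
    return 100 * n_bytes / TARGET_BYTES


for day, used in UTILIZATION.items():
    print(f"{day}: {to_tb(used):.1f} TB, {percent_of_target(used):.1f}% of target")
```

Under those assumptions, both measurements sit at roughly two thirds of the expanded target.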

Event Timeline

Change #1082172 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mediabackups: Setup new host for mediabackups backup[12]012

https://gerrit.wikimedia.org/r/1082172

Change #1082172 merged by Jcrespo:

[operations/puppet@production] mediabackups: Setup new host for mediabackups backup1012

https://gerrit.wikimedia.org/r/1082172

Change #1091731 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] backup: Move Dell bacula hosts to mediabackups

https://gerrit.wikimedia.org/r/1091731

Change #1091731 merged by Jcrespo:

[operations/puppet@production] backup: Move Dell bacula hosts to mediabackups

https://gerrit.wikimedia.org/r/1091731

Mentioned in SAL (#wikimedia-operations) [2024-11-20T14:23:26Z] <jynus> starting resharding of commons backup files into new host backup1010 T376892

Mentioned in SAL (#wikimedia-operations) [2024-11-20T15:31:53Z] <jynus> starting resharding of commons backup files into new host backup2010 T376892

Change #1093377 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mediabackup: Setup backup1010 as the 6th media backup host in eqiad

https://gerrit.wikimedia.org/r/1093377

Change #1093379 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mediabackup: Setup backup2010 as the 6th media backup host in codfw

https://gerrit.wikimedia.org/r/1093379

About 14 more hours are needed for the transfers to complete.

Change #1093377 merged by Jcrespo:

[operations/puppet@production] mediabackup: Setup backup1010 as the 6th media backup host in eqiad

https://gerrit.wikimedia.org/r/1093377

Change #1093379 merged by Jcrespo:

[operations/puppet@production] mediabackup: Setup backup2010 as the 6th media backup host in codfw

https://gerrit.wikimedia.org/r/1093379

Disk usage peaked at 94.2% and is finally on a downward trend: 93.7% 🎉

Timestamp is in CET:

[12:26:36] <jinxer-wm> RESOLVED: DiskSpace: Disk space backup2011:9100:/srv/objectstorage 5.998% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=backup2011 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace

This is now done. Catch-up and purging are ongoing, but once they finish, we should be able to store almost 1 PB of media backups in each datacenter. Backups should continue without problems or manual intervention over the coming months.
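
As a rough sanity check on how long the expanded space might last, the two utilization figures from the description can be turned into a linear growth estimate. This is only a sketch: it assumes growth stays linear, that 960 TB means decimal terabytes, and that the quoted byte counts apply to a single datacenter. None of these is confirmed in the task, and the helper names are hypothetical.

```python
from datetime import date

# Assumption: the 960 TB target is in decimal terabytes (10**12 bytes).
TARGET_BYTES = 960 * 10**12

# (measurement date, bytes used), as quoted in the task description.
first = (date(2024, 10, 10), 645_708_607_840_309)
last = (date(2024, 11, 21), 656_178_463_691_108)


def daily_growth(a, b) -> float:
    """Average bytes added per day between two measurements."""
    days = (b[0] - a[0]).days
    return (b[1] - a[1]) / days


def days_until_full(current: int, rate: float, capacity: int = TARGET_BYTES) -> float:
    """Days until capacity is reached if growth continues linearly at `rate`."""
    return (capacity - current) / rate


rate = daily_growth(first, last)
print(f"growth: {rate / 10**9:.0f} GB/day")
print(f"runway: {days_until_full(last[1], rate) / 365:.1f} years")
```

The estimate ignores filesystem overhead, the uneven data distribution across hosts, and the purging still in progress, so it should be read as an order-of-magnitude figure only.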

Data is not 100% balanced, which may require later tuning, but the initial goal of expanding the existing space and making sure hosts were not full has already been accomplished.