COE Data Storage


By the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet. The College's storage needs have grown accordingly, whether for research data kept to maintain compliance, college and departmental operational data, student team and project data, or data needed for internal or external collaboration.

CTS provides reliable network-accessible data storage for all of the College's academic departments and business units, as well as research data storage.

We also support storage offered through Office 365, including OneDrive and SharePoint.

COE Main Storage Hardware

Our operational and research file storage solution consists of two Dell PowerVault MD3640 storage arrays connected via 12Gb/s SAS to a PowerEdge R730 server. One of the storage arrays functions as the main file server repository, while the other is dedicated to backups. Both are configured as RAID10 arrays with multiple hot spares, with 203TB usable for files and 210TB available for backups (the difference arises because the file storage array has two dedicated SSD cache drives in place of additional storage drives).

Data deduplication

files and researchfiles are both virtual machines, with all storage volumes mounted as separate virtual hard drives formatted using NTFS with deduplication enabled. Deduplication currently saves about 50% of space on average. Because each research folder is a separate volume, each research group gets the full benefit of the dedup savings on its own data (as a real-world example, there is a research group with over 14TB of data, deduped down to under 5TB, leaving them 5TB of free space). If you copy a large amount of data up to your folder and later notice that the free space has gone back up, that is the result of the new data being deduplicated. If you want to understand how this process works, Microsoft has a write-up that explains it very well.
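
To make the savings figures concrete, here is a minimal sketch of the arithmetic behind a deduplication savings percentage. The function name and the rounded sizes (14TB before, 5TB after) are illustrative assumptions based on the example above, not values taken from the file server.

```python
def dedup_savings_rate(logical_tb: float, stored_tb: float) -> float:
    """Return the percentage of space saved by deduplication.

    logical_tb: size of the data as users see it (before dedup)
    stored_tb:  space the data actually occupies on disk (after dedup)
    """
    if logical_tb <= 0:
        raise ValueError("logical_tb must be positive")
    return (1 - stored_tb / logical_tb) * 100


# Rounded figures from the research group example (assumed: 14TB -> 5TB).
print(f"{dedup_savings_rate(14, 5):.1f}% saved")
```

Since the group in the example has more than 14TB of data stored in less than 5TB, its actual savings would be somewhat higher than this rounded estimate.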

Every night, a script updates a file called StorageUsageReport.txt in each group's folder. In it, the line marked SavingsRate gives the percentage saved through deduplication.
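
If you want to pull the SavingsRate value out of the report programmatically, something like the following sketch could work. The exact layout of StorageUsageReport.txt is not documented here, so the "SavingsRate : 64" line format, the sample text, and the function name are all assumptions; adjust the pattern to match your actual report.

```python
import re
from typing import Optional


def read_savings_rate(report_text: str) -> Optional[float]:
    """Return the number on the first line starting with 'SavingsRate'.

    Assumes a line such as 'SavingsRate : 64' (hypothetical layout).
    Returns None if no such line is found.
    """
    pattern = re.compile(r"\s*SavingsRate\s*[:=]?\s*(\d+(?:\.\d+)?)")
    for line in report_text.splitlines():
        match = pattern.match(line)
        if match:
            return float(match.group(1))
    return None


# Hypothetical report contents, for illustration only.
sample = "Volume : ExampleGroup\nSavingsRate : 64\n"
print(read_savings_rate(sample))
```

In practice you would read the text from the StorageUsageReport.txt file in your group's folder and pass it to the function.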

File server backups

CTS backs up data on files and researchfiles according to this schedule.