Folks,
We have directories that contain 1.8 million+ files. The filesystems
are VERY slow and they are VxFS. Is there a limit on how many files
can be in a VxFS directory before it takes a performance hit?
thanks,
keith
codybear
8/10/2008 10:15:40 PM
In article <e00f6c34-51c7-4d1c-b843-4c2a7af51b97@m36g2000hse.googlegroups.com>,
codybear <keithclay@gmail.com> writes:
> Folks,
>
> We have directories that contain 1.8million+ files. The filesystems
> are VERY slow and they are vxfs. Is there a limit on how many files
> can be in this a vxfs directory before it take a performance hit?
>
>
> thanks,
>
> keith
Hi Keith,
From experience I'd say around 100,000. Our backups (via NetBackup) have problems if the file count is much higher.
Yours, Hans Schwengeler
t7321
8/12/2008 5:40:14 AM
On Sun, 10 Aug 2008, keithclay@gmail.com wrote:
> Folks,
>
> We have directories that contain 1.8million+ files. The filesystems
> are VERY slow and they are vxfs. Is there a limit on how many files
> can be in this a vxfs directory before it take a performance hit?
>
>
> thanks,
>
> keith
Those are some fairly large directories. :-)
What version of HP-UX are you running? And what version of VxFS?
--
Carl Davidson (carl.davidson@hp.com)
Hewlett-Packard Company, Cupertino, CA 95014
You can't please all of the people any of the time.
Carl
8/12/2008 5:27:03 PM
codybear wrote:
> We have directories that contain 1.8 million+ files. The filesystems
> are VERY slow and they are vxfs. Is there a limit on how many files
> can be in this a vxfs directory before it take a performance hit?
The experts on the ITRC say you should not use the directory structure
as a database. 2 million is way too much.
By adding an extra directory level with at most 1000 entries each, you can
greatly reduce the number of files per directory.
Dennis
8/13/2008 3:52:41 AM
codybear <keithclay@gmail.com> writes:
> Folks,
>
> We have directories that contain 1.8million+ files. The filesystems
> are VERY slow and they are vxfs. Is there a limit on how many files
> can be in this a vxfs directory before it take a performance hit?
Hi!
Which layout version of VxFS do you have? Recent layouts are said to handle
many files in a directory better. The current layout for VxFS 5 is 7.
Even so, a hash-table-based directory lookup will suffer from hash
collisions. Every hash table has a design size regarding the number of table
entries and file name lengths. The more hash collisions you have, the more
lookups (CPU) are required to find a name.
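To make the collision argument concrete, here is a minimal sketch of a toy chained hash table. It is not VxFS's actual directory format; the bucket count of 1024, the file names, and the helper name mean_probes are all assumptions chosen for illustration. It counts how many name comparisons an average successful lookup needs as the number of entries grows past what the table was sized for.

```python
import hashlib
from collections import Counter

def mean_probes(num_names, num_buckets):
    """Average comparisons per successful lookup in a toy chained hash table.

    Hypothetical model only: names are "file0", "file1", ... and the bucket
    is chosen from an MD5 digest. Real filesystems use their own hashing.
    """
    chains = Counter()
    for i in range(num_names):
        digest = hashlib.md5(f"file{i}".encode("utf-8")).hexdigest()
        chains[int(digest, 16) % num_buckets] += 1
    # Finding the p-th entry of a chain costs p comparisons, so a chain of
    # length k costs 1 + 2 + ... + k = k * (k + 1) / 2 over all its entries.
    total = sum(k * (k + 1) // 2 for k in chains.values())
    return total / num_names

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, round(mean_probes(n, 1024), 2))
```

With a fixed number of buckets, the average cost grows roughly in proportion to the number of entries per bucket, which is the effect described above.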
Also, it depends heavily on how you access a directory:
An "ll" will read all entries, get the attributes of each, and sort them, while a
"find" will only look up the names. The fastest access is to probe directly for
a single file, like "test -f a_name".
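The three access patterns can be sketched in Python as follows. This is only an approximation of what "ll", a name-only walk, and "test -f" do; the function names are made up for this example, and actual timings depend on the filesystem and its caches.

```python
import os
import tempfile

def list_with_attributes(directory):
    """Roughly what "ll" does: read every entry, stat each one, then sort."""
    entries = []
    for name in os.listdir(directory):
        info = os.stat(os.path.join(directory, name))
        entries.append((name, info.st_size, info.st_mtime))
    return sorted(entries)

def list_names_only(directory):
    """Read the entry names only, with no per-file stat and no sorting."""
    with os.scandir(directory) as it:
        return [entry.name for entry in it]

def probe_single_file(directory, name):
    """Like "test -f a_name": one direct lookup, no directory scan."""
    return os.path.isfile(os.path.join(directory, name))

if __name__ == "__main__":
    # Small demo on a throwaway directory; scale the range up to see the gap.
    with tempfile.TemporaryDirectory() as d:
        for i in range(1000):
            open(os.path.join(d, f"file{i:04d}"), "w").close()
        print(len(list_with_attributes(d)))
        print(len(list_names_only(d)))
        print(probe_single_file(d, "file0500"))
```

If the application always knows the file name it wants, only the third pattern is needed, and it avoids enumerating the directory at all.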
The general method to optimize lookups is to reduce the number of files in a
directory (to less than 100, I'd suggest). Instead of looking up a file like
"01234567890", distribute the files (assuming the names are evenly distributed) in
a structure like "01/23/45/67/89/0" (i.e. a 5-level directory hierarchy with
100 entries at most each). You could also take the original file name and
compute a strong hash like MD5 or SHA-1 to determine the directories. As those
hashes are distributed quite evenly, you might pick any "digits" for distributing
the files.
For example: If your eight files produce these MD5 hashes (fingerprints):
321d1b34ba06106ad8d15dbd0cff4252
8643682de5a441e36627fa810c7d5db2
8fa5e4afb2da5c0e3314c760210d3819
9c6b1d1b2bbd59b8eaaedbd1c1768a9f
9d02b629b97700478868c931969ae55a
9e35af0a9281d878f76cacfaea63ee75
a345b8cfba5c00ead9cd1c731783645e
d0b678f97e61987a160d737092a5e3cc
You could use the first four characters to put your files into
3/2/1/d/
8/6/4/3/
8/f/a/5/
9/c/6/b/
9/d/0/2/
9/e/3/5/
a/3/4/5/
d/0/b/6/
That is 16 entries per directory-level, 65536 "buckets" altogether. If you use
two characters (like 32/1d/1b/34/) you'll have 256 entries per directory
(about four billion buckets). With just three two-character levels you'd have
16 million "buckets".
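A minimal Python sketch of both schemes, assuming the application controls where files are written. The helper names chunk_path, hashed_path and bucket_count are invented for this example, and the original file name is kept as the last path component in the hashed variant.

```python
import hashlib

def chunk_path(name, width=2):
    """Split the name itself into fixed-width pieces,
    e.g. "01234567890" becomes "01/23/45/67/89/0"."""
    parts = [name[i:i + width] for i in range(0, len(name), width)]
    return "/".join(parts)

def hashed_path(name, levels=4, width=1):
    """Build the directory part from the leading hex digits of the MD5 of the
    name, then append the original file name as the final component."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    dirs = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(dirs + [name])

def bucket_count(levels, width):
    """Number of leaf directories: 16 possible values per hex digit used."""
    return 16 ** (levels * width)

if __name__ == "__main__":
    print(chunk_path("01234567890"))
    print(hashed_path("a_name", levels=4, width=1))
    print(hashed_path("a_name", levels=3, width=2))
    print(bucket_count(4, 1), bucket_count(4, 2), bucket_count(3, 2))
```

With 1.8 million files and three two-character levels (about 16.7 million buckets), most leaf directories would hold zero or one file, so fewer or narrower levels may be enough; four one-character levels give roughly 27 files per bucket on average.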
So if you have control over the application that creates and accesses those
files, you could easily implement that. Alternatively, you should consider
using a lightweight database like Sleepycat's Berkeley DB. I'd advise the latter,
because when backing up those files (assuming they are rather short), the
backup software still has to enumerate them all before deciding whether to
save them or not.
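As a rough illustration of the database route, here is a sketch that stores small payloads under their file names in a single key-value table. It uses Python's standard sqlite3 module as a stand-in, not Sleepycat's library; the table name "blobs" and the helper functions are assumptions made for this example.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a single-file store keyed by the original file name."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn

def put(conn, name, data):
    """Insert or overwrite one entry; the with-block commits the change."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO blobs (name, data) VALUES (?, ?)",
            (name, data),
        )

def get(conn, name):
    """Return the stored bytes, or None if the name is unknown."""
    row = conn.execute(
        "SELECT data FROM blobs WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]

if __name__ == "__main__":
    store = open_store()
    put(store, "01234567890", b"short payload")
    print(get(store, "01234567890"))
    print(get(store, "no_such_name"))
```

The backup software then sees one file (or a handful) instead of millions of directory entries.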
Not an HP specialist ;-)
Regards,
Ulrich
Ulrich
8/18/2008 9:09:36 AM