Posted to uk.d-i-y
Defragging Linux (was should DIY be a green cause)



"polygonum" wrote in message
...
On 28/03/2016 06:00, Johnny B Good wrote:
On Sat, 26 Mar 2016 09:28:35 +0000, The Natural Philosopher wrote:

On 25/03/16 21:20, Vir Campestris wrote:
On 24/03/2016 22:43, The Natural Philosopher wrote:
Oh the joys of Linux, and no defragging ever unless the disk is 100% full

I've heard this said, and I never can work out how.

If I put 5000 files on my disk, and delete every alternate one, how can
it not be fragmented?

Well of course it is somewhat, but the point is that new files tend to
be written in the middle of the biggest free space, depending on the
actual disk format in use, so they tend to simply grow linearly.

Fragmentation isn't a file in a random place, it's a file in dozens of
random places, so getting at the entire contents takes many seeks.
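
To make that concrete, here's a toy Python sketch (names invented purely for
illustration) of the allocation approach described above: drop each new file
into the middle of the largest free extent so it has room to grow before it
runs into a neighbour. Real Linux allocators are far more sophisticated, so
treat this as a cartoon of the principle only.

def largest_free_extent(free_extents):
    # free_extents: list of (start_block, length) tuples
    return max(free_extents, key=lambda ext: ext[1])

def place_new_file(free_extents, file_blocks):
    start, length = largest_free_extent(free_extents)
    if length < file_blocks:
        raise IOError("no single extent big enough - now we fragment")
    # start the file roughly mid-extent, leaving slack on both sides for growth
    slack = (length - file_blocks) // 2
    return start + slack   # first block of the new file

free = [(0, 100), (500, 4000), (9000, 250)]
print(place_new_file(free, 64))   # lands near the middle of the big extent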

http://www.howtogeek.com/115229/htg-...x-doesnt-need-defragmenting/

I was quite surprised at the explanation. The strategy of scattering files
across the disk volume 'in order to allow them space into which to grow'
seemed so bogus, seeing as most files are edited by writing a complete new
copy before renaming the original as a backup file, either temporarily
(simply to avoid the naming conflict) or as a safety rollback copy.
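
For anyone who hasn't seen it spelled out, that "write a complete new copy,
then rename" save pattern looks roughly like this minimal Python sketch (the
file names and the .bak convention are mine, purely for illustration):

import os

def safe_save(path, data):
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())              # make sure the new copy really hit the disk
    if os.path.exists(path):
        os.replace(path, path + ".bak")   # keep the old version as a rollback copy
    os.replace(tmp, path)                 # rename the new copy into place

safe_save("notes.txt", "edited contents\n")

The worst a crash can then do is leave you with the previous version plus a
stale .tmp file, rather than a half-written original.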

That scatter-for-growth approach was actually the complete opposite of the
strategy in a filing system I designed for a solenoid-operated data cassette
deck, where I formatted each side of a C60-sized tape into 168 blocks of
2048 bytes (a max formatted capacity of 328KB per side of each tape after
using the middle two blocks for a duplicated directory and LNKTBL, an 8-bit
version of what I later discovered was called a "File Allocation Table"
(FAT) in MSDOS terminology).

Here, the big idea was to preserve the larger sequences of free blocks as
much as possible when writing new data to a part-used tape whose free blocks
had been randomly scattered by previous file deletions. I didn't want to
drop a 2-block file into the next available 3-block space without first
searching the LNKTBL (FAT) for the existence of a conveniently 2-block-sized
chunk of free space.
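
In modern terms that is a best-fit allocation search. A rough Python
reconstruction of the idea (the list-of-booleans block map is only my
stand-in for the real LNKTBL, whose layout I'm guessing at):

def best_fit_run(block_map, needed):
    # block_map: list of bools, True = free block.
    # Returns (start, run_length) of the smallest free run that still fits,
    # or None if nothing does, so the big runs are saved for big files.
    best = None
    i = 0
    while i < len(block_map):
        if block_map[i]:
            j = i
            while j < len(block_map) and block_map[j]:
                j += 1
            run = j - i
            if run >= needed and (best is None or run < best[1]):
                best = (i, run)
            i = j
        else:
            i += 1
    return best

# A 2-block file should take the 2-block hole, not chew into the 5-block one.
tape = [False, True, True, False, True, True, True, True, True, False]
print(best_fit_run(tape, 2))   # -> (1, 2)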

The tape, being effectively a single huge track of 168 blocks requiring
fast-forward/rewind operations, could well do without unnecessary
fragmentation, particularly of the larger files (at 5 seconds to read an
8K chunk of tape and 14 seconds end-to-end search time from first to last
block), so it was important to preserve the larger runs of free space
"for better things" than a mere 1-to-3-block chunk of data.

Since it's rather unusual for data processing software (text editors, word
processors and so on) to write changes *directly* into the one and only copy
of a working file, risking all in the event of a system crash or a power
outage, virtually every app wrote the changes as a completely new file
(usually after renaming the original to avoid the naming conflict). That
way, should the worst happen, you only risked losing the changes rather than
the whole lot. The FAT was duplicated to avoid the same risk (I duplicated
both the directory and the LNKTBL, which together neatly fitted into a
single 2K block, hence the use of two such blocks in my tape filing system).
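
A guess at how that duplication might look in code: write the combined 2K
directory/LNKTBL block to both reserved slots and read each copy back to
verify it (the slot numbers and layout are assumptions, not the original
design):

BLOCK_SIZE = 2048
META_SLOTS = (83, 84)      # assumed: the two middle blocks of a 168-block side

def write_metadata(tape, meta_block):
    assert len(meta_block) == BLOCK_SIZE
    for slot in META_SLOTS:
        tape[slot] = bytes(meta_block)                    # write each copy
    return all(tape[slot] == meta_block for slot in META_SLOTS)  # verify both

tape = [bytes(BLOCK_SIZE)] * 168                          # blank 168-block side
print(write_metadata(tape, b"\x01" * BLOCK_SIZE))         # -> True if both copies match

If one copy later fails to read back, the other still lets you recover the
directory.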

With my strategy of minimising fragmentation of free space being *such*
a "No Brainer", I've always assumed MSFT's FAT based FSes used a similar
strategy, BICBW.

However, I suppose the extra breathing space at the end of each file's block
allocation can still prove useful even when files aren't edited in place but
are instead deleted once their updated replacements have been successfully
written to disk, since it ensures another, larger chunk of free space will
be available for a yet later 'edit', whether of the same file or a
completely different one. On an HDD there's no great detriment in that
strategy, unlike on a linear tape, where it would 'Suck Big Time'(tm).

Incidentally (setting aside the use of SSDs), the key to minimising
fragmentation-induced performance loss in an MSFT FS is optimised
partitioning of the HDD into three partitions (OS, apps and data).

For a win7 setup, you'd probably need a 40GB "drive C" for the OS and
pagefile, perhaps another 30 or 40GB drive D for the common-or-garden apps,
with the remaining 850/1780 GB of space on your 1 or 2 TB drive allocated to
a general-purpose drive E data volume. Splitting an HDD's space this way
stops the OS's file activity from poisoning the other disk volumes with
fragmentation.
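
The quoted figures work out once you remember that a marketing "1 TB" shows
up as roughly 931 GiB in Windows. A quick Python check (the 40GB C and D
sizes are the post's suggestion, not a hard rule):

GIB_PER_TB = 1000**4 / 1024**3          # ~931.3 GiB per marketing terabyte

for drive_tb in (1, 2):
    total = drive_tb * GIB_PER_TB
    os_vol, apps_vol = 40, 40           # suggested C: and D: sizes
    data_vol = total - os_vol - apps_vol
    print(f"{drive_tb} TB drive: ~{total:.0f} GiB total, "
          f"{os_vol} GiB C:, {apps_vol} GiB D:, ~{data_vol:.0f} GiB E:")

# prints roughly the 850 and 1780 GB data volumes mentioned above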

The same goes, to a much lesser degree, for the apps volume, and it makes it
impossible for Windows to thinly spread its sixty-odd thousand system files
right across the whole of the disk platters to "mix it" with the rest of
your apps and data files (the "drive C" and "drive D" partitions effectively
become short-stroked 40GB portions of a humongous 931 or 1862 GB HDD,
reducing seek times whilst enjoying the fastest SDTR region of the disk).

On those rare occasions when you think it *just* might be worth spending a
few minutes defragging drive volumes C and D, that's just about the size of
the defrag job: minutes[1] rather than hours/overnight in the case of the
classic lazy OEM *******'s trick of 'everything into a single huge disk
partition'[2].

[1] In the case of the 8GB win2k partition on the 1TB drive, it literally
was just a couple of minutes to completely defrag drive C (and similarly
for the 20GB apps volume - less file churn for a kick off).

I wasn't worried about fragmentation on the large data volumes which, after
several months, could take anywhere from 1 to 3 hours to defrag. Most of the
data were large, GB-sized media files which didn't suffer performance issues
on playback or copying/moving, although intense video processing of a badly
fragmented movie file would suffer a modest performance drop. That was
largely mitigated by my arranging the source and destination folders to be
on different physical disk drives to eliminate head contention, which neatly
negated the worst effects of file fragmentation anyway.

[2] Even when the ******* OEMs seemingly started to split the huge disk
drives into a couple of partitions (plus maintenance/repair and recovery
partitions), they often used a ridiculously large drive C volume with a
small 100 to 200GB drive D volume on a 1TB drive (or, at best, a "50/50
split" - still too damn big to mitigate the "Fragmentation Hell" effect of
lumping all the OS, apps and user data into a single 500GB disk volume).

Windows allows a certain amount of space for each new file created. I
can't remember how much, and it might vary by version, but there is a
setting for it - probably in the registry.

If your files are always less than this size, and you have lots of space,
you will get no fragmentation. Even if the file starts very small and
grows to this size.

If you defrag your drive, the files are put cheek by jowl against each
other. Any growth of a file will result in fragmentation.

If you copy an existing file, Windows will attempt to place it in the
smallest contiguous space large enough for the file. Simply copying a
large, heavily fragmented file will defragment it if you have enough
contiguous free space.

The files that fragment badly are those that start small, and grow
allocation unit by allocation unit - interspersed with other files also
being written. Classics are various log files.


And it doesn't really matter if those log files do get quite fragmented,
because they are hardly ever read from end to end except when browsing
them, when you're reading much more slowly than the file can be read
anyway, so the extra seeks between fragments don't matter at all.
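
As a toy illustration of how those slowly growing files end up interleaved,
here's a little Python model with a naive "next free cluster" allocator
(entirely invented, just to show the shape of the problem):

def next_free(disk):
    return disk.index(None)

disk = [None] * 16                     # 16 clusters, all free
files = {"app.log": [], "other.dat": []}

for _ in range(6):                     # both files grow one cluster per round
    for name in files:
        c = next_free(disk)
        disk[c] = name
        files[name].append(c)

for name, clusters in files.items():
    fragments = 1 + sum(1 for a, b in zip(clusters, clusters[1:]) if b != a + 1)
    print(name, clusters, "-> fragments:", fragments)

# each file ends up in alternating clusters, i.e. six fragments apiece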

Using an appropriate allocation unit size can have a huge impact. Few
people ever seem to set up dedicated volumes with sensibly chosen AU
sizes for particular purposes. Of course, managing multiple volumes can be
a right pain...

--
Rod
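
On the allocation unit point, a back-of-envelope Python look at the
trade-off: bigger AUs mean fewer fragments for large files but more wasted
slack on small ones (the example file sizes are invented, not measured):

import math

files = [1_500, 80_000, 4_000_000, 700_000_000]   # bytes: tiny, small, medium, big

for au in (4096, 64 * 1024, 1024 * 1024):
    clusters = [math.ceil(size / au) for size in files]
    slack = sum(c * au - size for c, size in zip(clusters, files))
    print(f"AU {au // 1024:>4} KiB: clusters per file {clusters}, "
          f"total slack {slack / 1024:.0f} KiB")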