Forum Index > Full Moon Saloon > Lots of Folders Bad for Performance? (Website Stuff)
Josh Journey
a.k.a Josh Lewis



Joined: 01 Nov 2007
Posts: 4830 | TRs | Pics
Posted: Thu May 15, 2014 2:19 am
This post is about folders on a web server. I've heard that too many images in a single folder can cause performance issues, but do lots of folders create a performance issue too? I'm running a website that creates a folder per uploaded image. Down the road I expect between 1 million and a few million photos to be uploaded, which means 1-3 million folders. Each folder stores 6 copies of the image at various sizes.

If this is problematic, one idea is to have one folder per album, which on average would hold 30-90 photos (multiplied by 6 for the sizes). It's just an idea; what I really want is to follow best practices for image storage. So my two options for storage are:

site/images/folder-id/id-size-file-name.jpg (single folder per album)
site/images/folder-id/photo-id/size-file-name.jpg (single folder per image)

If anyone knows something about this, or knows someone who might, please let me know. I can't really test this myself, because by the time a problem shows up it will be too late and my site will be as slow as a slug.
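To make the two options concrete, here is a minimal sketch of how each path could be built and roughly how many directories each layout produces. The function names, the example IDs, and the figure of 60 photos per album are assumptions for illustration; only the two path patterns come from the post.

```python
from pathlib import PurePosixPath


def per_album_path(album_id: int, photo_id: int, size: str, name: str) -> str:
    """site/images/folder-id/id-size-file-name.jpg (one folder per album)."""
    return str(PurePosixPath("site/images", str(album_id), f"{photo_id}-{size}-{name}"))


def per_photo_path(album_id: int, photo_id: int, size: str, name: str) -> str:
    """site/images/folder-id/photo-id/size-file-name.jpg (one folder per image)."""
    return str(PurePosixPath("site/images", str(album_id), str(photo_id), f"{size}-{name}"))


def directory_count(photos: int, photos_per_album: int, per_photo: bool) -> int:
    """Rough number of directories a layout creates (album folders, plus photo folders if used)."""
    albums = -(-photos // photos_per_album)  # ceiling division
    return albums + photos if per_photo else albums


print(per_album_path(12, 345, "large", "summit.jpg"))
print(per_photo_path(12, 345, "large", "summit.jpg"))
print(directory_count(3_000_000, 60, per_photo=False))
print(directory_count(3_000_000, 60, per_photo=True))
```

Under those assumptions, 3 million photos at about 60 per album gives roughly 50,000 album directories holding a few hundred files each, while the per-image layout adds one more directory for every photo.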

Cyclopath
Faster than light



Joined: 20 Mar 2012
Posts: 7694 | TRs | Pics
Location: Seattle
Posted: Thu May 15, 2014 8:05 am
I don't know the answer to your question but a couple of things occur to me: (1) The answer is going to depend on whether you're using Windows/IIS or Linux/Apache. (2) The amount of time it takes the OS to get the file you want is going to be a tiny fraction of the amount of time it'll take to transfer the image over the wire. Personally, I'd go with one folder per album just because it seems like it will be easier to manage.

Riverside Laker
Member
Member


Joined: 12 Jan 2004
Posts: 2818 | TRs | Pics
Posted: Thu May 15, 2014 12:25 pm
Wow, are there a million decent photos on this planet? It would take 50 eight-hour days to look at them 1 second at a time (with weekends off).

Sore Feet
Member
Member


Joined: 16 Dec 2001
Posts: 6304 | TRs | Pics
Location: Out There, Somewhere
Posted: Thu May 15, 2014 7:25 pm
I can't say I've ever investigated best practices in regard to image storage, but the number of folder levels shouldn't affect performance at all. It may however affect your search placement, because Google's algorithms do take the number of directory tiers into account in their rankings (whether this applies specifically to images, I don't know).

Is there a particular reason you need to do one image per folder, and not all images of a similar size in a folder dedicated to that size, for example (assuming you're aiming to have different sized versions):

url.com/images/small/imgID-imgName-imgSize.jpg
url.com/images/med/imgID-imgName-imgSize.jpg
url.com/images/large/imgID-imgName-imgSize.jpg

...or something of that nature. Adding in the folder-id and photo-id levels seems superfluous and unnecessary to me.

I would be much less concerned with the folder structure than I would about the file names themselves. If you're storing URL data in a database or in XML in any way, and something gets nuked to the point where you may need to reconstruct broken URLs, it'll be a hell of a lot easier to do if there is a consistent naming schema to the image files themselves. Associating the image file name with the database record for the particular gallery, or trip report, or whatever is the key there, but how you do that isn't necessarily set in stone.

Josh Journey
a.k.a Josh Lewis



Joined: 01 Nov 2007
Posts: 4830 | TRs | Pics
Posted: Thu May 15, 2014 8:27 pm
Cyclopath wrote:
(1) The answer is going to depend on whether you're using Windows/IIS or Linux/Apache.
I'm using Apache according to my web server info.
Riverside Laker wrote:
Wow, are there a million decent photos on this planet? It would take 50 eight-hour days to look at them 1 second at a time (with weekends off).
In this day and age, I'd say yes. As for the number, I based it on another famous mountaineering site that does not have bulk uploading. Mine already does, which will make people much more inclined to upload photos.
Sore Feet wrote:
I can't say I've ever investigated best practices in regard to image storage, but the number of folder levels shouldn't affect performance at all.
Right, I don't plan on going too deep. 3 folders deep at most.
Sore Feet wrote:
Is there a particular reason you need to do one image per folder, and not all images of a similar size in a folder dedicated to that size, for example (assuming you're aiming to have different sized versions):
This is the current gallery structure that my gallery provider is using. They seem to think it's easier to manage this way. I disagree, but my biggest concern is performance. I don't want to lose speed for the sake of making things super tidy. Fortunately I have a lot of influence over that software. As for why not to keep all of one size in a single folder (the structure you mentioned), I hear this can cause performance issues (imagine a million photos in the same folder). Flickr breaks it up per X amount of photos rather than storing all large photos in a single directory. Here is living proof:
Code:
http://farm8.staticflickr.com/7361/12129096953_ce1b8d5dfb_c.jpg
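The idea of capping how many files land in any one directory can be sketched as below. This is a generic hash-prefix sharding technique, not a description of how Flickr stores files; as far as Flickr's public URL documentation describes it, the numeric path segment in the URL above is a server ID rather than a folder bucket. The function name, the `images` root, and the choice of SHA-256 are assumptions.

```python
import hashlib


def shard_path(photo_id: int, filename: str, levels: int = 2, width: int = 2) -> str:
    """Place a file under nested directories named after chunks of a hash of its ID."""
    digest = hashlib.sha256(str(photo_id).encode("ascii")).hexdigest()
    # Each level uses the next `width` hex characters of the digest.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(["images", *parts, f"{photo_id}-{filename}"])


print(shard_path(12129096953, "large.jpg"))
```

With two levels of two hex characters there are 256 x 256 = 65,536 leaf directories, so 3 million photos at 6 sizes each (18 million files) would average roughly 275 files per directory if the hash spreads them evenly.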
Sore Feet wrote:
Adding in the folder-id and photo-id levels seems superfluous and unnecessary to me.
Indeed not as pretty. But I don't know of a better way. I tried to convince them to have a folder per user, but they figured that could become too many images for some users. But no matter what, we need each image and its folder to always be unique so that the system can tell them apart.
Sore Feet wrote:
I would be much less concerned with the folder structure than I would about the file names themselves
An ID method would do the trick, but the developer doesn't seem as interested in this method. The url for the image is indeed being stored in the database.

ejain
Member
Member


Joined: 27 Apr 2009
Posts: 1497 | TRs | Pics
Location: Seattle, WA
Posted: Fri May 16, 2014 12:11 pm
Whatever you do, make sure that:
- The URL for an image never changes.
- The image a URL points to never changes.
Note that URLs do not have to map 1:1 to the file system.
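One way to get both properties, sketched here under assumptions, is to put a short content hash in the public URL and translate that URL to a disk location in a separate function. The URL pattern, the `/srv/images` root, the bucket size of 1000, and both function names are hypothetical and not something the thread specifies.

```python
import hashlib


def public_url(photo_id: int, data: bytes, size: str) -> str:
    """Build a URL that embeds a content hash, so different bytes get a different URL."""
    tag = hashlib.sha256(data).hexdigest()[:10]
    return f"/photos/{photo_id}_{tag}_{size}.jpg"


def storage_path(url: str, root: str = "/srv/images") -> str:
    """Translate the public URL into the on-disk location (free to change later)."""
    stem = url.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    photo_id, tag, size = stem.split("_")  # assumes no underscores inside the parts
    bucket = int(photo_id) // 1000
    return f"{root}/{bucket}/{photo_id}/{size}-{tag}.jpg"


url = public_url(345, b"example image bytes", "large")
print(url, "->", storage_path(url))
```

Because only `storage_path` knows the directory layout, the folders can be reorganized later without any published URL changing.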

Josh Journey
a.k.a Josh Lewis



Joined: 01 Nov 2007
Posts: 4830 | TRs | Pics
Posted: Fri May 16, 2014 1:03 pm
Thanks for the heads up, of course I know that would break the images. ;) Right now I have registration on my site 100% closed off, so if the few images that are uploaded break, it's not a big deal. I'm trying to follow best practices. So whether lots of folders creates a performance issue is still a mystery. I couldn't find good answers on Google.

christensent
Member
Member


Joined: 05 Nov 2011
Posts: 658 | TRs | Pics
Posted: Fri May 16, 2014 2:25 pm
I've not heard of lots of files in a folder being bad. It can make Windows Explorer hurt trying to actually display the contents of the folder, and you might cry if you tried to get a directory listing on the command line, but purely from a file system structural standpoint, I don't think there is anything wrong with it. Assuming what is meant is that it's hard for Explorer to display the folder, then a million folders in a folder will probably be not quite, but almost, as bad as a million files in a folder. The easiest layout for an OS to navigate a million files would be 1000 files in each of 1000 folders. Of course actual sorting logic will dictate different distributions, but you get the idea.
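The 1000-by-1000 split can be sketched with integer division, assuming photos carry sequential numeric IDs from 0 to 999,999. The `images` root, the zero-padded names, and the function name are illustrative choices, not something from the thread.

```python
def path_for(photo_id: int, per_folder: int = 1000) -> str:
    """Map a sequential ID to folder = id // per_folder, file = id % per_folder."""
    folder, slot = divmod(photo_id, per_folder)
    return f"images/{folder:03d}/{slot:03d}.jpg"


print(path_for(0))
print(path_for(123_456))
print(path_for(999_999))
```

Each folder fills up in ID order, so no directory ever holds more than `per_folder` entries; IDs beyond 999,999 simply produce folder names with more digits.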

Learning mountaineering: 10% technical knowledge, 90% learning how to eat
touron
Member
Member


Joined: 15 Sep 2003
Posts: 10293 | TRs | Pics
Location: Plymouth Rock
Posted: Fri May 16, 2014 6:03 pm
Riverside Laker wrote:
Wow, are there a million decent photos on this planet? It would take 50 eight-hour days to look at them 1 second at a time (with weekends off).
Actually, it would take 50 four-hour days if you use one eye for each photo, thus looking at two photos at a time. With the time saved, you could probably hike up Mt. Si and back, or ski up Rainier and back.

Touron is a nougat of Arabic origin made with almonds and honey or sugar, without which it would just not be Christmas in Spain.
cairn builder
Member
Member


Joined: 19 Aug 2013
Posts: 854 | TRs | Pics
Posted: Fri May 16, 2014 7:13 pm
Riverside Laker wrote:
Wow, are there a million decent photos on this planet? It would take 50 eight-hour days to look at them 1 second at a time (with weekends off).
Did you accidentally post this to the wrong thread? The relevance is not obvious.

Sore Feet
Member
Member


Joined: 16 Dec 2001
Posts: 6304 | TRs | Pics
Location: Out There, Somewhere
Posted: Sat May 17, 2014 12:30 pm
Ok, so it sounds like you're using a Joomla (I believe that's the CMS you've discussed in the past) plugin or something like that which doesn't have a ton of flexibility in how you can register the data outside of the pre-defined paths. So in that regard I wouldn't worry at all about best practices regarding loading the images themselves. The best practice will be to minimize the amount of data which needs to be called for each associated record. Because your query should only be returning the database records for images associated with each particular report, your site should not be parsing a folder at all when loading an image, because you should be loading the direct URI of each image to the output page. If you have a static root directory structure, then you should just be storing something like "sub-directory1/sub-directory2/imgFileName.jpg" in a dedicated image database as a string (and then using a JOIN in your SQL when querying the primary database). As long as you have that full path stored and don't have to parse directories to find files with a certain data ID, then there should be absolutely no measurable performance hit.
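A minimal sketch of that lookup, using Python's built-in `sqlite3` in place of the site's real database. The table and column names, the sample rows, and the `/images/` base are hypothetical, and both tables live in one in-memory database here purely to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reports (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    report_id INTEGER REFERENCES reports(id),
    rel_path TEXT NOT NULL          -- full relative path stored as a string
);
INSERT INTO reports VALUES (1, 'Example trip report');
INSERT INTO images VALUES (10, 1, 'sub-directory1/sub-directory2/imgFileName.jpg');
INSERT INTO images VALUES (11, 1, 'sub-directory1/sub-directory2/imgFileName2.jpg');
""")


def image_urls(conn, report_id, base="/images/"):
    """Return image URLs for one report using a JOIN; no directory is ever listed."""
    rows = conn.execute(
        "SELECT i.rel_path FROM reports r "
        "JOIN images i ON i.report_id = r.id "
        "WHERE r.id = ? ORDER BY i.id",
        (report_id,),
    )
    return [base + path for (path,) in rows]


print(image_urls(conn, 1))
```

The page only ever asks the database for paths and hands them to the browser, so the web server opens each file directly by name regardless of how many sibling folders exist.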

Josh Journey
a.k.a Josh Lewis



Joined: 01 Nov 2007
Posts: 4830 | TRs | Pics
Posted: Sat May 17, 2014 1:01 pm
Yup, I'm using Joomla. As for the extension, it's a "heavy duty" component that does a lot in a single extension instead of needing many. Regarding the image paths, I can store my images outside of the images folder, which is nice. Here is what it looks like inside my database:
3 Main Photo Tables
jml_social_photos
jml_social_photos_meta
jml_social_photos_tag

Sore Feet
Member
Member


Joined: 16 Dec 2001
Posts: 6304 | TRs | Pics
Location: Out There, Somewhere
Posted: Sun May 18, 2014 10:24 am
Looks like you're pretty much covered, excessively so even, given all the metadata that's being stored in the DB. I wouldn't even waste time worrying about it at this point. That plugin is pretty clearly written to handle everything you need to be worrying about.

Josh Journey
a.k.a Josh Lewis



Joined: 01 Nov 2007
Posts: 4830 | TRs | Pics
Posted: Sun May 18, 2014 12:54 pm
I admit that I am a little worried about the metadata: it stores EXIF data not as one record per photo, but as one record per EXIF field, as seen in the screenshot above. I would think that it would store all the EXIF data in a single record per photo.
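The two storage shapes being compared can be sketched side by side. The table names, column names, and sample EXIF values below are hypothetical (the actual columns of jml_social_photos_meta are not shown in the thread), and `sqlite3` stands in for the site's real database.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photo_meta (photo_id INTEGER, tag TEXT, value TEXT);   -- one row per EXIF field
CREATE TABLE photo_exif (photo_id INTEGER PRIMARY KEY, exif TEXT);  -- one JSON record per photo
""")

exif = {"Model": "ExampleCam", "ISO": "200", "FNumber": "8.0"}
conn.executemany("INSERT INTO photo_meta VALUES (1, ?, ?)", exif.items())
conn.execute("INSERT INTO photo_exif VALUES (1, ?)", (json.dumps(exif),))


def exif_from_rows(conn, photo_id):
    """Rebuild the EXIF dict from one-row-per-field storage."""
    cur = conn.execute(
        "SELECT tag, value FROM photo_meta WHERE photo_id = ?", (photo_id,)
    )
    return dict(cur)


def exif_from_record(conn, photo_id):
    """Read the EXIF dict from a single record per photo."""
    row = conn.execute(
        "SELECT exif FROM photo_exif WHERE photo_id = ?", (photo_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


print(exif_from_rows(conn, 1) == exif_from_record(conn, 1))
```

Both shapes return the same data for a photo. The row-per-field form makes it easy to query across photos by a single field, at the cost of several rows per photo; the single-record form keeps the table small but needs the whole blob parsed to read one field.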

   All times are GMT - 8 Hours