From: Rich P on 18 Feb 2010 13:34

Actually, I have two questions.

I wrote a program which displays images in a slideshow-type manner. The image files are .jpg, .bmp, and .gif images. They are stored in a variety of subfolders under a parent folder. There are about 14,000 image files in these subfolders. I retrieve each image file as follows:

    List<string> myList = new List<string>();
    foreach (string str1 in My.Computer.FileSystem.GetFiles("C:\\1A",
        FileIO.SearchOption.SearchAllSubDirectories,
        "*.jpg", "*.bmp", "*.gif"))
    {
        myList.Add(str1);
    }

Currently, I read all the files into this list object, then display each image for 1 second, then display the next image, ... in a loop (it is basically a search-for-something-by-seeing-it program). The image files are named in the following manner:

    aaa1.jpg
    aaa2.jpg
    aaa3.jpg
    ...
    aaa100.jpg
    abcbbb1.jpg
    abcbbb2.jpg
    ...
    abcbbb77.jpg
    cccttttt1.jpg
    cccttttt2.jpg
    ...
    cccttttt142.jpg
    ddd1.bmp
    ddd2.jpg
    ddd3.gif
    ddd4.jpg
    ...
    ddd95.jpg
    ...

The subfolders are named fldA, fldB, fldC, ... fldZ, where image files that begin with "a" are stored in fldA, image files that begin with "b" are stored in fldB, ...

Basically, I have groups of image files where a group has the same beginning text in the filename (alpha chars) followed by a number (incremented as aaa1, aaa2, aaa3, ... aaa100). So group "aaa" may have 100 image files that begin with "aaa" before encountering a numeric char, group "abcbbb" may have 77 files, ...

What I want to do is this: when a group of images begins displaying (group "aaa", for example), I want to display the count of files in that group while that group of images is being displayed. I would have a label reading "Count of 'aaa' is 100". Then when the next group of images is displayed, the label would change to "Count of 'abcbbb' is 77", and so on. I could either pick the max count for a given group, or I could do a "Group By" type query on the current group of images being displayed.

Then, for each group, I would have to search the filename for the point at which the char becomes numeric and then find the max number value, or do the "Group By" thing based on the alpha portion of the filenames. In pseudocode I would have something like this:

    class myGroup
    {
        // alpha part of the filenames in the group
        string GroupName;
        // count of files in this group
        int GroupCount;
    }

    string s1, s2;
    int Lcount = 0;

    // store just the group names in another list object
    List<myGroup> myGroups = new List<myGroup>();
    // myGroups = LinQ magic to get the groups - parsing out the number part
    // of the group filenames from the 14,000 files in myList -- which may be
    // only 200 individual groups of image files

    for (int i = 0; i < myGroups.Count; i++)
    {
        // now get the list of filenames for this group
        // more LinQ magic to get just the "aaa's", then the "abcbbb's",
        // then the "bbb's", ...
        List<string> newFileList = new List<string>();
        // get files from myList where the alpha portion of the filenames
        // matches the current myGroups[i].GroupName

        LabelCount.Text = myGroups[i].GroupCount.ToString()
            + " files in " + myGroups[i].GroupName;

        for (int j = 0; j < newFileList.Count; j++)
        {
            // display image
        }
    }

Question 1: Could LinQ do this? If yes, may I ask for an example of how?

Question 2: Would it be more efficient to read the subfolders individually, where I would just loop through each subfolder? For example, subfolder fldA may store 1000 image files, fldM may have 3000 files, and fldQ may have only 50 image files. Right now I am just reading everything into memory, all 14,000 filenames. Would there be any performance/efficiency difference between reading everything in one chunk and reading the subfolders individually?

Rich

*** Sent via Developersdex http://www.developersdex.com ***
From: Peter Duniho on 18 Feb 2010 14:11

Rich P wrote:
> Actually, I have two questions.
>
> I wrote a program which displays images in a slideshow type manner. The
> image files are .jpg and .bmp and .gif images. They are stored in a
> variety of subfolders under a parent folder. There are about 14,000
> image files in these subfolders. I retrieve each image file as follows:
>
> List<string> myList = new List<string>();
> foreach (string str1 in My.Computer.FileSystem.GetFiles("C:\\1A",
>     FileIO.SearchOption.SearchAllSubDirectories,
>     "*.jpg", "*.bmp", "*.gif"))
> {
>     myList.Add(str1);
> }

You should rid your code of any VB references. There's really no need for them in C#, and doing things "the VB way" in C# will only slow you down in the long run. Use the System.IO.Directory class, and its GetFiles() method in particular, to obtain a list of files found at a path.

Also note that the List<T> class has an AddRange() method. It is much more efficient, especially when adding a large number of items, to use that method instead of adding items individually.

For the moment, I'll take for granted that storing in memory the names of 14,000 files all at once makes sense. But that seems potentially inefficient as well. :)

So, as for the questions:

> [...]
> Question 1: Could LinQ do this? If yes - may I ask for an example how?

LINQ certainly can group data. One question is, is there a particular order you need the groups to be presented in? And can you confirm that you do in fact want to display a given group of pictures together? Or is it simply that you want the count of pictures in a given group to be displayed with a given picture from that group?

Assuming you have an enumeration of all the files, you can group them like this:

    char[] _rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

    var grouped = from filename in myList
                  group filename by Path
                      .GetFileName(filename)
                      .Substring(0, filename.IndexOfAny(_rgchDigits));

> Question 2: would it be more efficient to read the subfolders
> individually? Where I would just loop through each subfolder.

The most efficient thing would be for each group of files to be in its own folder. It's not feasible to try to retrieve individual groups from a single folder; to enumerate the files individually by group, you'd have to generate the group names and filter a file enumeration by that. You might as well in that case just get all the files for a folder and then group them.

That said, certainly working on one folder at a time rather than trying to manage everything all at once could be more _memory_ efficient, if not performance efficient. User perception of performance could be better, simply because your program isn't trying to do so much all at once (the big performance hit being all the i/o involved in retrieving 14,000 file names from the directory structure all at once).

Hope that helps.

Pete
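
Note on the grouping query above: as posted, IndexOfAny() is applied to the full path string while Substring() is applied to the bare file name, so a digit in a folder name (the "1" in C:\1A, for instance) would decide where every name gets cut. A name with no digit at all would also make Substring() throw, because IndexOfAny() returns -1. The following is only a minimal sketch of one way around both problems, not the code from the thread. GroupKey is a hypothetical helper, the sample paths are made up, and the program is written with top-level statements (C# 9 or later) to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical sample data standing in for the real contents of myList.
// Note that the parent folder name "1A" itself contains a digit.
var myList = new List<string>
{
    Path.Combine("1A", "fldA", "aaa1.jpg"),
    Path.Combine("1A", "fldA", "aaa2.jpg"),
    Path.Combine("1A", "fldA", "abcbbb1.jpg"),
    Path.Combine("1A", "fldD", "ddd1.bmp"),
    Path.Combine("1A", "fldD", "ddd2.jpg"),
    Path.Combine("1A", "fldD", "ddd3.gif"),
};

// Group the full paths by the alphabetic prefix of each file name.
var grouped = from path in myList
              group path by GroupKey(path);

foreach (var g in grouped)
{
    Console.WriteLine("Count of '" + g.Key + "' is " + g.Count());
}

// Hypothetical helper: returns the part of the file name that precedes
// the first digit, or the whole name (without extension) if there is no digit.
static string GroupKey(string path)
{
    string name = Path.GetFileNameWithoutExtension(path);
    int firstDigit = name.IndexOfAny("0123456789".ToCharArray());
    return firstDigit < 0 ? name : name.Substring(0, firstDigit);
}
```

With the sample data this prints "Count of 'aaa' is 2", "Count of 'abcbbb' is 1" and "Count of 'ddd' is 3", in that order, since GroupBy yields groups in the order their keys first appear. Each group still holds the full paths, so the same result can drive both the label text and the slideshow loop.
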
From: Rich P on 18 Feb 2010 15:41

Thank you for your reply. And "enumeration" was the word I believe I was looking for to describe how I have these image files organized. When I read the files, they are alphabetic. I read all the A's first, then the B's, C's, ... Q's, W's, X's, Z's.

Confession (bless me, almighty one, for mixing VB with C# :) I have been doing VB/VB.Net for several years and have been migrating to C# for the last couple of years. So I don't have all the C# stuff down yet.

Question:

    My.Computer.FileSystem.GetFiles("C:\\1A",
        FileIO.SearchOption.SearchAllSubDirectories...

This will search all the subdirectories. How do I search all subdirectories with the System.IO.Directory class and GetFiles()? Before My.Computer... I used to have a recursive routine that would read each subfolder\subfolder... using Windows APIs. It was pretty fast, but way more lines of code than My.Computer...

Question 2:

> char[] _rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
>
> var grouped = from filename in myList
>               group filename by Path
>                   .GetFileName(filename)
>                   .Substring(0, filename.IndexOfAny(_rgchDigits));

Let's say I set up a test scenario where I have a list of test text files in C:\1A\A, C:\1A\B, C:\1A\C.

In subfolder A I have the following test text files (no content):

    testA1.txt
    testA2.txt
    testA3.txt
    testAB1.txt
    testAB2.txt
    testAB3.txt
    testAB4.txt

Then in subfolder B I have:

    testB1.txt
    testB2.txt
    testB3.txt
    testBC1.txt
    testBC2.txt
    testBC3.txt
    testBC4.txt

and the same in subfolder C for the C's.

I want to read all of these text file names into a list and group them by testA, testAB, testB, testBC, testC, testCD, and get a count of each group, where group testA is count = 3, testAB is count = 4, testB count = 3, testBC count = 4, ...

Using System.IO.Directory, how can I read each subdirectory to populate my list of the test text file names? And how can I use LINQ to group this list for something like the following?

    foreach (myGroupTestTxt grp in Result of LinQ Magic)
        Console.WriteLine(groupName + " " + groupCount.ToString());

Rich
From: Peter Duniho on 18 Feb 2010 18:12

Rich P wrote:
> [...]
> Question:
>
> My.Computer.FileSystem.GetFiles("C:\\1A",
>     FileIO.SearchOption.SearchAllSubDirectories...
>
> this will search all the subdirectories. How do I search all
> subdirectories with System.IO.Directory class - GetFiles() ?

See: http://msdn.microsoft.com/en-us/library/ms143316.aspx

> [...]
> Question2:
>
> char[] _rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8',
>     '9' };
>
> var grouped = from filename in myList
>               group filename by Path
>                   .GetFileName(filename)
>                   .Substring(0, filename.IndexOfAny(_rgchDigits));
>
> [...]
> I want to read all of these text file names into a list and group them
> by testA, testAB, testB, testBC, testC, testCD, and get a count of each
> group where group testA is count = 3, testAB is count = 4, testB count =
> 3, testBC count = 4, ...
>
> Using System.IO.Directory how can I read each subdirectory to populate
> my list of the test text file names?

You can either enumerate files one directory at a time (see Directory.GetDirectories() for getting a list of directories in a directory), or see above for enumerating all files recursively under a given path.

> and how can I use linQ to group
> this list for something like the following?
>
> foreach (myGroupTestTxt grp in Result of LinQ Magic)
>     Console.WriteLine(groupName + " " + groupCount.ToString());

Assuming the code I proposed:

    foreach (var group in grouped)
    {
        Console.WriteLine(group.Key + " " + group.Count().ToString());
    }

This stuff is all in the documentation. Given the code I proposed earlier, you could have even used VS's Intellisense to see what the query result was and figure out how to use it, but of course you could also have started with the Enumerable.GroupBy() method to see what it returns and followed the chain of class features from there.

Pete
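
Note on the two answers above: the recursive enumeration and the per-group count can be put together roughly as follows. This is a sketch under stated assumptions, not code from the thread. It builds a throwaway folder tree under the temp directory (the GroupDemo_ folder name and the single .jpg file are made up for the demo), calls Directory.GetFiles() once per pattern because that overload accepts a single search pattern, and takes the group key by trimming trailing digits, which assumes the number is always at the very end of the name. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Build a small throwaway folder tree so the sketch can run anywhere.
// The file names mirror the test scenario in the thread; the .jpg is made up.
string root = Path.Combine(Path.GetTempPath(), "GroupDemo_" + Guid.NewGuid().ToString("N"));
var layout = new Dictionary<string, string[]>
{
    { "A", new[] { "testA1.txt", "testA2.txt", "testA3.txt",
                   "testAB1.txt", "testAB2.txt", "testAB3.txt", "testAB4.txt" } },
    { "B", new[] { "testB1.txt", "testB2.txt", "testB3.txt", "testBC1.jpg" } },
};
foreach (var folder in layout)
{
    string dir = Path.Combine(root, folder.Key);
    Directory.CreateDirectory(dir);
    foreach (string name in folder.Value)
    {
        File.WriteAllText(Path.Combine(dir, name), "");
    }
}

// One GetFiles() call per pattern; AllDirectories makes each call recursive.
var myList = new List<string>();
foreach (string pattern in new[] { "*.txt", "*.jpg" })
{
    myList.AddRange(Directory.GetFiles(root, pattern, SearchOption.AllDirectories));
}

// Key = file name without extension, with the trailing digits trimmed off.
char[] digits = "0123456789".ToCharArray();
var counts = myList
    .GroupBy(path => Path.GetFileNameWithoutExtension(path).TrimEnd(digits))
    .OrderBy(g => g.Key, StringComparer.Ordinal)
    .Select(g => new { Name = g.Key, Count = g.Count() })
    .ToList();

foreach (var entry in counts)
{
    Console.WriteLine(entry.Name + " " + entry.Count);
}

// Remove the demo tree again.
Directory.Delete(root, true);
```

The explicit OrderBy is there because the order in which GetFiles() returns names is not guaranteed. Count() is the LINQ extension method, so it needs the parentheses and a "using System.Linq;" directive.
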
From: Rich P on 19 Feb 2010 12:11

As always, thank you very much for your reply. I am now using System.IO.Directory for drilling into subdirectories -- very nice! And I am attempting to use the code sample you have proposed. But VS is complaining. Here is what I have attempted thus far:

    private void GroupFiles()
    {
        DirectoryInfo di = new DirectoryInfo(@"C:\1A\1AA\");
        FileInfo[] files = di.GetFiles("*.txt", SearchOption.AllDirectories);
        List<string> myList = new List<string>();
        foreach (FileInfo file in files)
            myList.Add(file.Name);

        foreach (string str1 in myList)
            Console.WriteLine(str1);

        /* only searching 1 dir for now -- here are my test files
        test1.txt
        test2.txt
        test3.txt
        test4.txt
        testA1.txt
        testA2.txt
        testA3.txt
        */

        char[] _rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

        var grouped = from filename in myList
                      group filename myPath
                      .GetFilename(filename)    <<<--- VS complains here
                      .Substring(0, filename.IndexOfAny(_rgchDigits));
    }

I apologize in advance for my ignorance on the subject of LINQ, but when I add your proposed code to the routine above, VS complains as noted. At this point in time I don't have enough experience/intuition to see what is missing or where to go next with the LINQ part of the exercise. Any suggestions would be greatly appreciated on how I could list the count of groups of my test files, like group "test" has count = 4, and group "testA" has count = 3. How do I proceed with LINQ to obtain this information?

Thanks again for all the help.

Rich
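
Note on the attempt above: the thread as archived ends without a reply, so what follows is only a likely reading of the compiler complaint, not a confirmed diagnosis. Three things in the query stand out. The "by" keyword is missing between the element and the key in the group clause. "myPath" is not declared anywhere; the class in the proposed code is System.IO.Path. And the method on that class is spelled GetFileName, with a capital N. Since myList here already holds file.Name values rather than full paths, the Path call can be dropped altogether. A minimal sketch with the seven test names hard-coded (top-level statements, C# 9 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The test file names from the post, hard-coded so the sketch has no
// dependency on C:\1A\1AA existing.
var myList = new List<string>
{
    "test1.txt", "test2.txt", "test3.txt", "test4.txt",
    "testA1.txt", "testA2.txt", "testA3.txt",
};

char[] _rgchDigits = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

// "group <element> by <key>": the key is everything before the first digit.
// This assumes every name contains at least one digit; otherwise
// IndexOfAny returns -1 and Substring throws.
var grouped = from filename in myList
              group filename by filename.Substring(0, filename.IndexOfAny(_rgchDigits));

foreach (var g in grouped)
{
    Console.WriteLine("group \"" + g.Key + "\" has count = " + g.Count());
}
```

For the names listed this prints: group "test" has count = 4, then group "testA" has count = 3.
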