The Melbourne ADUG August meeting is on Monday night, 18th August.
John McDonald will be doing a presentation exploring
a bit further into Spring4D iterators and showing how you too can get the
benefits of Spring4D iterators without major alterations to your code.
When: 6:00pm for a 6:15pm start
Where: At the Melbourne Men’s Shed, and on Zoom.
Zoom link will be up here shortly before the meeting starts.
I went to Claude .. and after a few tries at making something concise …
function ZipByValue<T>(const list1, list2: IList<T>): IEnumerable< TPair<T,T> >;
begin
  var list2Copy: IList<T> := TCollections.CreateList<T>(list2); // Work with a copy
  Result := list1
    .Select< TPair<T,T> > (
      function(const item: T): TPair<T,T>
      begin
        var idx := list2Copy.IndexOf(item);
        if idx >= 0 then begin
          result := TPair<T,T>.Create(item,item);
          list2Copy.RemoveAt(idx); // Remove so it can't be matched again
        end
        else
          result := TPair<T,T>.Create(item, Default(T));
      end
    )
    .Concat (
      list2Copy.Select< TPair<T,T> > (
        function(const item: T): TPair<T, T>
        begin
          result := TPair<T, T>.Create(Default(T), item);
        end
      )
    );
end;
I think this is incorrect (above) … ( I’m having an issue finding and using Select (!) … maybe I have the coroutine branch, and it’s different?? )
[ Edit : I was getting hung up on TEnumerable vs IEnumerable< T > ]
We want something in the general direction of :
lista := list1.Intersect(list2); -> make into TPair( x,x )
listb := list1.Exclude(list2); -> make into TPair( x,_ )
listc := list2.Exclude(list1); -> make into TPair( _,x )
Ok … the idea. No clue what the performance would be like, or if it can be called lazily …
(PS: really dislike the term ‘Select’ instead of ‘Transform’. Blame C# and LINQ for that one.)
{$APPTYPE CONSOLE}
program Project2;

uses
  spring,
  spring.Collections;

type ints = TPair<integer,integer>;

begin
  var list1 := TCollections.CreateList<integer>([1,2,3,4,5,6]);
  var list2 := TCollections.CreateList<integer>([2,4,6,8,10,12]);

  var lista := list1.Intersect(list2);
  var listb := list1.Exclude(list2);
  var listc := list2.Exclude(list1);

  for var i in listb do writeln(i); writeln;
  for var i in listc do writeln(i); writeln;

  var result := TEnumerable.Select<integer,Ints>(
      lista,
      function(const x: integer): Ints
      begin
        result := Ints.Create(x,x);
      end
    )
    .Concat (
      TEnumerable.Select<integer,Ints>(
        listb,
        function(const x: integer): Ints
        begin
          result := Ints.Create(x,0);
        end
      )
    )
    .Concat (
      TEnumerable.Select<integer,Ints>(
        listc,
        function(const x: integer): Ints
        begin
          result := Ints.Create(0,x);
        end
      )
    );

  for var i in result do writeln( i.Key:3,' ',i.Value:3 );
  readln;
end.
Thanks Paul for posting that. It appears to work ok and is surprisingly fast given what it is doing. I would have expected the Intersect and Exclude methods to have slowed the code more.
I’ll do some more experiments with this, probably on the weekend.
I’m still kinda a Spring newbie, so there may well be better approaches.
A little update.
I have updated the project a little to keep an enum (TFileMatch) .. it makes the using / calling code cleaner.
I realise that when I create the data-structure for a file that exists in both versions, it doesn’t store the two versions. This needs a bit of thought.
I added a second version that does keep the information of both branches of the diff.
I don’t know that I’d call it an elegant solution to the problem, though.
I’ve done some more testing with your code. Your earlier version takes about 3 times as long as my code for about 400,000 files (40 seconds compared with 14 seconds).
Your code was faster than I expected, given the intersect and exclude calls. I don’t understand how the spring code for Intersect and Exclude works. It does call TFileInfoRPComp.GetHashCode. It looks like it creates some sort of tree to get fast access, but I don’t have time to analyse what it’s doing.
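For what it's worth, the speed would be consistent with a hash-based set operation, although nobody in this thread has confirmed how Spring4D implements Intersect and Exclude. Below is a minimal sketch of the general technique using only the RTL's TDictionary. FilterByMembership is a hypothetical helper, not Spring4D code, and unlike a true set operation it keeps any duplicates that appear in A.

```pascal
program HashSetOpsSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils,
  System.Generics.Collections;

// Hypothetical helper (not part of Spring4D): returns the items of A that
// occur in B when WantPresent is True, or that do not occur in B when False.
function FilterByMembership(const A, B: TArray<Integer>;
  WantPresent: Boolean): TArray<Integer>;
var
  Seen: TDictionary<Integer, Boolean>;
  Item, Count: Integer;
begin
  Seen := TDictionary<Integer, Boolean>.Create;
  try
    // One pass over B builds the hash table.
    for Item in B do
      Seen.AddOrSetValue(Item, True);

    SetLength(Result, Length(A));
    Count := 0;
    // One pass over A; each ContainsKey is a hash lookup, not a scan of B.
    for Item in A do
      if Seen.ContainsKey(Item) = WantPresent then
      begin
        Result[Count] := Item;
        Inc(Count);
      end;
    SetLength(Result, Count);
  finally
    Seen.Free;
  end;
end;

var
  A, B: TArray<Integer>;
  I: Integer;
begin
  A := TArray<Integer>.Create(1, 2, 3, 4, 5, 6);
  B := TArray<Integer>.Create(2, 4, 6, 8, 10, 12);

  for I in FilterByMembership(A, B, True) do   // "intersect": 2 4 6
    Write(I, ' ');
  Writeln;
  for I in FilterByMembership(A, B, False) do  // "exclude": 1 3 5
    Write(I, ' ');
  Writeln;
end.
```

With this shape the total work is roughly one pass over each list, which would explain why hashing (the GetHashCode calls) shows up in the profile while the overall cost stays modest.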
Your earlier code only has the file info for one side of the match. Your later version gets the file info for the other side, but this increases elapsed time for 400,000 files from 40 seconds to 54 minutes.
It would be possible to use TDirectory.GetFiles to get two lists of relative paths, then do the intersect and excludes on these lists. Then use FindFirst etc to get the file info for one or both sides as part of the Selector methods in the three select statements. That should perform ok but it might take some tweaking.
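A rough sketch of the first step of that idea, collecting relative paths with TDirectory.GetFiles from System.IOUtils. RelativeFiles is a hypothetical helper name, and the code is untested against a large tree: folders the process cannot read (likely somewhere under C:\Users) would need error handling that is not shown here.

```pascal
program RelPathsSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils,
  System.IOUtils,
  System.Generics.Collections;

// Hypothetical helper: lists every file under ARoot, recursively,
// as a path relative to ARoot.
function RelativeFiles(const ARoot: string): TArray<string>;
var
  Root, FullName: string;
  Paths: TList<string>;
begin
  Root := IncludeTrailingPathDelimiter(ARoot);
  Paths := TList<string>.Create;
  try
    for FullName in TDirectory.GetFiles(Root, '*', TSearchOption.soAllDirectories) do
      // Strip the root prefix so both sides of the diff share the same keys.
      Paths.Add(Copy(FullName, Length(Root) + 1, MaxInt));
    Result := Paths.ToArray;
  finally
    Paths.Free;
  end;
end;

begin
  if ParamCount < 1 then
    Writeln('usage: RelPathsSketch <directory>')
  else
    Writeln(Length(RelativeFiles(ParamStr(1))), ' files found');
end.
```

The two resulting arrays could then be fed to the intersect and exclude steps, with the file info fetched only for the paths that need it.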
I’m not sure that this would end up being a more elegant solution than mine.
I will put my code up on GitHub, but I want to test it more thoroughly first.
Is minutes correct? I did do a test on a C:\Users vs D:\Users with a lot of files and dirs. All the significant time was spent in the directory gathering calls.
I meant to go back and try the multimap instead of a zip.
I ran the tests again, with the only changes to your code being:
Change the paths read.
Disabled the three writeln statements - I don’t want to include the time it takes to write to the console.
Increment three counters during the iteration; MatchCount, DeleteCount, AddCount.
Add code to measure the elapsed time for sections of code.
Write out the elapsed times and the counters.
Results were:
32 bit Debug.
Time to create the two snapshots = 0:37.363
Time to call ZipDirSnapshots = 0:00.000
– this takes almost no time because it is just setting up the Enumerables. None of the work is done here.
Total Elapsed Time = 53:11.862
So the time to iterate through comparison is 52 minutes 35 seconds.
– this is where all the work of the Intersect and Exclude statements and the snap2.FirstOrDefault calls takes place.
Counts were: Match 192828, Add 196994, Del 9062.
64 bit Release
Time to create the two snapshots = 0:35.047
Time to call ZipDirSnapshots = 0:00.000
Total Elapsed Time = 51:19.963
(time to iterate through comparison = 50 minutes 45 seconds).
Counts: Match 192812, Add 196998, Del 9078.
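If each snap2.FirstOrDefault call scans the second snapshot linearly (an assumption, since the snapshot unit isn't shown in this thread), then the iteration cost grows with the number of matches times the snapshot size, which would fit the jump from seconds to minutes. A minimal sketch of the usual alternative is to index one side by relative path once, then look each file up by key. TFileRec and MakeRec are hypothetical stand-ins for the thread's TFileInfo.

```pascal
program LookupSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils,
  System.Generics.Collections;

type
  // Hypothetical stand-in for the thread's TFileInfo.
  TFileRec = record
    RelativePath: string;
    Size: Int64;
  end;

function MakeRec(const APath: string; ASize: Int64): TFileRec;
begin
  Result.RelativePath := APath;
  Result.Size := ASize;
end;

var
  Left, Right: TArray<TFileRec>;
  RightByPath: TDictionary<string, TFileRec>;
  L, R: TFileRec;
begin
  Left := TArray<TFileRec>.Create(MakeRec('a.txt', 10), MakeRec('b.txt', 20));
  Right := TArray<TFileRec>.Create(MakeRec('b.txt', 25), MakeRec('c.txt', 30));

  RightByPath := TDictionary<string, TFileRec>.Create;
  try
    // Build the index once: one pass over the right-hand snapshot.
    for R in Right do
      RightByPath.AddOrSetValue(R.RelativePath, R);

    // Each lookup is now a hash probe instead of a scan of the whole list.
    for L in Left do
      if RightByPath.TryGetValue(L.RelativePath, R) then
        Writeln('MATCHED: ', L.RelativePath, ' ', L.Size, ' -> ', R.Size)
      else
        Writeln('REMOVED: ', L.RelativePath);
  finally
    RightByPath.Free;
  end;
end.
```

Note that the default string comparer is case-sensitive, so for Windows paths the keys would probably need normalising (for example lower-casing) before being stored and looked up.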
{$APPTYPE CONSOLE}
program Proj_Comp_Dirs4;

uses
  spring.Collections,
  U_DirSnapshot4 in 'U_DirSnapshot4.pas';

procedure CompareDirectoriesWithMultiMap;
var
  matched,removed,added : integer;
begin
  // Local integers are not zero-initialised in Delphi, so clear the counters first
  matched := 0;
  removed := 0;
  added := 0;

  var multiMap := TCollections.CreateMultiMap<string, TFileInfo>;
  var snapshot1 := TDirectorySnapshot.Create('c:\users', left,multimap);
  var snapshot2 := TDirectorySnapshot.Create('c:\users',right,multimap);
  writeln('3*');

  for var key in multiMap.Keys.Distinct do
  begin
    var values := multiMap.Items[key]; // Gets all values for this key
    case values.Count of
      // One value: the file exists on one side only
      1: case values.First.Side of
           left : begin
                    inc(removed);
                    //writeln('REMOVED: ', values.First.RelativePath);
                  end;
           right: begin
                    inc(added);
                    //writeln('ADDED : ', values.First.RelativePath);
                  end;
         end;
      // Two values: the file exists on both sides
      2: begin
           inc(matched);
           //writeln('MATCHED: ', values.First.RelativePath);
         end;
    end;
    //writeln('Key: ',key, ' has ',values.Count,' values');
  end;

  writeln('m: ',matched,' r: ',removed,' a: ',added);
  snapshot1.Free;
  snapshot2.Free;
end;

begin
  CompareDirectoriesWithMultiMap;
  readln;
end.