:icons: font :source-highlighter: highlightjs :nofooter: :sectnums: = FlexArray -- Live Coding Workshop Implementation of an ArrayList with automatic growth. We do it manually at least once before starting to use `List`. == Project Setup Create a new project called `FlexArrays` in an _empty_ directory. [source,shell] ---- // install templates if not done already // dotnet new install HTLLeonding.Utility.Templates dotnet new leoconsole -n FlexArrays -o . ---- == Flexible array for `int` === Construction As usual, we start with a test. .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void Construction_NoSize() { var array = new FlexArrayInt(); array.Should().NotBeNull(); array.Count.Should().Be(0, "no items added yet"); array.Capacity.Should().Be(FlexArrayInt.DefaultStartSize, "default capacity"); } ---- To allow the application to compile we need the class `FlexArrayInt`. .FlexArrayInt.cs [source,csharp] ---- public sealed class FlexArrayInt { public const int DefaultStartSize = 4; private int[] _data; public FlexArrayInt(int? initialSize = null) <1> { var size = Math.Max(0, initialSize ?? DefaultStartSize); <2> _data = new int[size]; Count = 0; } public int Count { get; private set; } <3> public int Capacity => _data.Length; <4> } ---- <1> Providing an initial capacity is optional <2> Ensuring only valid values are passed <3> We will adjust that value every time an item is added or removed <4> The _current_ max. capacity of the list, will change when a _growth_ happens *Turn on continuous testing now!* Time for more tests. .FlexArrayIntTests.cs [source,csharp] ---- [Theory] [InlineData(0)] [InlineData(4)] [InlineData(20)] [InlineData(100_000)] public void Construction_WithSize(int size) { var array = new FlexArrayInt(size); array.Should().NotBeNull(); array.Count.Should().Be(0, "no items added yet"); array.Capacity.Should().Be(size, "initial capacity set to specified size"); } [Fact] public void Construction_WithInvalidSize() { var array = new FlexArrayInt(-5); array.Should().NotBeNull(); array.Count.Should().Be(0); array.Capacity.Should().Be(0, "negative size provided => set to 0"); } ---- === Adding items The list has been created, now we need to be able to add elements to it. .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void AddItem_Single() { var array = new FlexArrayInt(); array.Add(17); array.Count.Should().Be(1, "one item added"); array.Capacity.Should().Be(FlexArrayInt.DefaultStartSize, "capacity unchanged"); } ---- .FlexArrayInt.cs [source,csharp] ---- public void Add(int value) { if (Capacity == Count) <1> { Grow(); <1> } _data[Count++] = value; <2> } private void Grow() { // TODO } ---- <1> If we are out of free space we have to grow <2> Adding the element to the end of the list The basic functionality is showing 'green', but does our solution work when adding multiple items? .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void AddItem_Multiple_NoGrowth() { var array = new FlexArrayInt(); array.Add(17); array.Add(20); array.Add(-100); array.Count.Should().Be(3, "three items added"); array.Capacity.Should().Be(FlexArrayInt.DefaultStartSize, "no growth required yet"); } [Fact] public void AddItem_Multiple_Growth() { var array = new FlexArrayInt(); void AddItems(int count) <1> { for (var i = 0; i < count; i++) { array.Add(i + 1); } } AddItems(FlexArrayInt.DefaultStartSize); array.Capacity.Should().Be(FlexArrayInt.DefaultStartSize); AddItems(2); array.Capacity.Should().BeGreaterThan(FlexArrayInt.DefaultStartSize, "capacity exceeded, had to grow"); <2> array.Count.Should().Be(FlexArrayInt.DefaultStartSize + 2); } ---- <1> Using a local function to reduce code duplication <2> The list had to grow, but we do not constrain in the test how much it has grown -- that is an implementation detail and up to the list => we (usually) test expected _behaviour_ & _functionality_, not how it is accomplished The test is 'red' => we haven't implemented growth logic yet. .FlexArrayInt.cs [source,csharp] ---- private void Grow() { var newData = new int[Capacity * 2]; <1> Array.Copy(_data, newData, Count); <2> _data = newData; <3> } ---- <1> Out growth strategy is to double the size of the array every time -- the more items we add to the list the more space is initially 'wasted'. We could also apply logarithmic scale or a combination; varying scaling factors etc. -- profiling would be required to find a good solution. <2> Don't forget to copy the data from the old array or the values the user added will be lost <3> Replacing the reference to the new array -- the old one will be GC'd === Contains Next, we want to check if our list contains a certain element. .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void Contains_Success() { var array = new FlexArrayInt(); array.Add(12); array.Add(23); array.Add(-900); array.Add(76); array.Contains(23) .Should().BeTrue(); } [Fact] public void Contains_Success_AfterGrowth() { var array = new FlexArrayInt(); for (var i = 0; i <= FlexArrayInt.DefaultStartSize; i++) { array.Add(i); } array.Contains(FlexArrayInt.DefaultStartSize - 1) .Should().BeTrue(); } [Fact] public void Contains_Empty() { var array = new FlexArrayInt(); array.Contains(23) .Should().BeFalse("collection is empty"); } [Fact] public void Contains_NotFound() { var array = new FlexArrayInt(); array.Add(-12); array.Add(23); array.Contains(12) .Should().BeFalse("not contained in list"); } ---- .FlexArrayInt.cs [source,csharp] ---- public bool Contains(int value) { if (Count == 0) { return false; } for (var i = 0; i < Count; i++) { if (_data[i] == value) { return true; } } return false; } ---- === Accessing items by index We want to get items out of the list by index. This will also allow us to iterate it using a `for` loop. .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void Indexer_Simple() { var array = new FlexArrayInt(); array.Add(89); array[0].Should().Be(89); } [Fact] public void Indexer_Multiple() { var array = new FlexArrayInt(); for (var i = 0; i < 100; i++) { array.Add(i+1); } array[20].Should().Be(21); array[88].Should().Be(89); } [Theory] [InlineData(-1, "negative index")] [InlineData(2, "index too high")] [InlineData(10, "index too high")] public void Indexer_InvalidIndex(int index, string reason) { var array = new FlexArrayInt(); array.Add(-89); array[index].Should().Be(-1, reason); } [Fact] public void Iterate() { var array = new FlexArrayInt(); for (var i = 0; i < 10; i++) { array.Add(i); } for (var i = 0; i < 10; i++) { array[i].Should().Be(i); } } ---- .FlexArrayInt.cs [source,csharp] ---- public int this[int index] { get { if (index < 0 || index >= Count) { return -1; } return _data[index]; } } ---- === Removing items Finally, we need to be able to remove items from the list. We will provide two options: . Remove by index . Remove values .. First occurrence .. All occurrences The critical thing is, that we cannot allow 'gaps' in our backing array, so we'll have to _shift_ remaining values. This also highlights the primary weakness of an ArrayList. ==== `RemoveAt` .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void Remove_ByIndex_Simple() { var array = new FlexArrayInt(); array.Add(40); array.Count.Should().Be(1); array.RemoveAt(0).Should().BeTrue("index exists"); array.Count.Should().Be(0, "count has to be reduced"); } [Fact] public void Remove_ByIndex_Shifting() { var array = new FlexArrayInt(); array.Add(30); array.Add(31); array.Add(32); array.Count.Should().Be(3); array.RemoveAt(1).Should().BeTrue(); array.Count.Should().Be(2); array[1].Should().Be(32, "items to the right have to be shifted left after removal"); array.RemoveAt(2) .Should().BeFalse("not a valid index any longer, despite a value still being there"); } [Fact] public void Remove_ByIndex_FromEnd() { var array = new FlexArrayInt(); array.Add(30); array.Add(31); array.Add(32); array.Count.Should().Be(3); array.RemoveAt(2).Should().BeTrue(); array.Count.Should().Be(2); array[1].Should().Be(31); } [Fact] public void Remove_ByIndex_Invalid() { var array = new FlexArrayInt(); array.Add(30); array.RemoveAt(1) .Should().BeFalse("the array is big enough for this index, but no value has been set there"); array.Count.Should().Be(1, "unchanged"); array.RemoveAt(-1) .Should().BeFalse("negative index doesn't make any sense"); } ---- .FlexArrayInt.cs [source,csharp] ---- public bool RemoveAt(int index) { if (index < 0 || index >= Count) { return false; } if (index != Count - 1) { ShiftLeft(index); <1> } Count--; return true; } private void ShiftLeft(int fromIndex) { for (var i = fromIndex; i < Count - 1; i++) { _data[i] = _data[i + 1]; <2> } } ---- <1> If not removing at the end we have to shift to avoid gaps <2> Why don't we have to set the last index (to the right) to `null`? Didn't we just duplicate a value that is now in the list twice? No, because we reduce the `Count`, so the duplicated value is now 'out of scope'. ==== `Remove` .FlexArrayIntTests.cs [source,csharp] ---- [Fact] public void Remove_Single_OneOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(3); array.Remove(2, true) .Should().BeTrue("element was found"); array.Count.Should().Be(2, "element was removed"); } [Fact] public void Remove_Single_OnlyOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Remove(1, true) .Should().BeTrue("element was found"); array.Count.Should().Be(0, "element was removed"); } [Fact] public void Remove_Single_MultipleOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(2); array.Remove(2, true) .Should().BeTrue("element was found"); array.Count.Should().Be(2, "only one element was removed"); } [Fact] public void Remove_Single_NotFound() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(3); array.Remove(4, true) .Should().BeFalse("element was not found"); array.Count.Should().Be(3, "unchanged"); } [Fact] public void Remove_Multiple_OneOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(3); array.Remove(2, false) .Should().BeTrue("element was found"); array.Count.Should().Be(2, "only one element was removed"); } [Fact] public void Remove_Multiple_MultipleOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(2); array.Add(3); array.Add(2); array.Add(4); array.Add(2); array.Remove(2, false) .Should().BeTrue("elements were found"); array.Count.Should().Be(3, "four elements were removed"); } [Fact] public void Remove_Multiple_OnlyOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(1); array.Add(1); array.Remove(1, false) .Should().BeTrue("elements were found"); array.Count.Should().Be(0, "all elements were removed"); } [Fact] public void Remove_Multiple_NoOccurrence() { var array = new FlexArrayInt(); array.Add(1); array.Add(2); array.Add(3); array.Remove(4, false) .Should().BeFalse("element was not found"); array.Count.Should().Be(3, "unchanged"); } ---- .FlexArrayInt.cs [source,csharp] ---- public bool Remove(int value, bool firstOnly) { var removedAny = false; for (var i = 0; i < Count; i++) <3> { if (value != _data[i]) { continue; } RemoveAt(i--); <1><2><3> removedAny = true; if (firstOnly) { break; } } return removedAny; } ---- <1> Re-use existing functionality whenever possible <2> Why do we have to decrement the loop index here? <3> `RemoveAt` will modify `Count` -- but we need `Count` for our loop, why does that work? Could we use a `foreach` loop? NOTE: Why aren't we shrinking the array again if a sufficient amount of items has been removed? Because that would require yet another copy operation and 'wasting' some memory is usually more efficient than spending cycles. What many array list implementation do is to provide a dedicated `Trim` operation, so the user can actively decide when they want to spend the cycles to reduce the memory footprint. == Flexible array for `string` Implement the same functionality for `string` instead of `int`.