As part of my self-improvement challenge, I have been watching the Introduction to Prism course on Pluralsight. I chose this course to be better equipped for my team's Prism application project at work, where I was recently tasked with improving the startup performance.
At this time, the project contains sixty-nine IModule implementation types, and that number continues to grow. Not all of these modules are loaded at once, and some may not be used or loaded at all; some are conditionally loaded at runtime when certain criteria are met.
While watching the Initializing Modules video, I found myself wondering whether anything would change if I switched these conditionally loaded modules' InitializationMode from the default WhenAvailable to OnDemand. My reasoning: in the video, Brian Lagunas explains that WhenAvailable initializes modules as soon as possible, while OnDemand initializes them only when the application needs them. Brian recommends using OnDemand if the module is not required for the application to run, is not always used, or is rarely used.
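For reference, a module's initialization mode is set when it is added to the module catalog in the bootstrapper. A minimal sketch of the two modes (the module names here are hypothetical):

```csharp
// Sketch of a bootstrapper's catalog configuration; module names are hypothetical.
protected override void ConfigureModuleCatalog()
{
    ModuleCatalog moduleCatalog = (ModuleCatalog)this.ModuleCatalog;

    // Initialized as soon as possible during startup (the default).
    moduleCatalog.AddModule(typeof(CoreModule), InitializationMode.WhenAvailable);

    // Initialized only when something requests it at runtime.
    moduleCatalog.AddModule(typeof(ReportingModule), InitializationMode.OnDemand);
}
```

An OnDemand module is later brought in explicitly through IModuleManager.LoadModule, which is how conditionally loaded modules would be initialized once their criteria are met.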
I have a few concerns:
Features could break because a module is not loaded beforehand and its initialization is never triggered manually.
There may be no performance gain because this project handles module initialization itself, parallelizing the work instead of letting Prism manage it.
In the end, only benchmarking each option will provide the information needed to make a decision. To do this I used JetBrains dotTrace, focusing on the timings for App.OnStartup, Bootstrapper.Run, Bootstrapper.ConfigureModuleCatalog, and Bootstrapper.InitializeModules. Since we load modules in parallel, I added a timing for that work as well - otherwise, the other timings would have appeared off.
Baseline - InitializationMode.WhenAvailable
The first step was to gather baseline metrics.
All timings are in milliseconds.

| Method | Profile #1 | Profile #2 | Profile #3 | Profile #4 | Profile #5 | Min | Average | Median | Max | STD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| App.OnStartup | 5845 | 4687 | 4220 | 4545 | 4973 | 4220 | 4854 | 4687 | 5845 | 551.6462635 |
| Bootstrapper.Run | 5954 | 3986 | 2598 | 3293 | 2779 | 2598 | 3722 | 3293 | 5954 | 1215.581013 |
| Bootstrapper.ConfigureModuleCatalog | 1148 | 767 | 363 | 511 | 1.5 | 1.5 | 558.1 | 511 | 1148 | 385.1511911 |
| Bootstrapper.InitializeModules | 184 | 109 | 117 | 85 | 71 | 71 | 113.2 | 109 | 184 | 39.0404918 |
| Asynchronous Module Initialization | 1821 | 2233 | 2311 | 2571 | 2564 | 1821 | 2300 | 2311 | 2571 | 274.6590614 |
Not terrible, but not ideal. The application splash screen is displayed for about 4.5 seconds on average on a developer machine with only a few conditional modules enabled.
InitializationMode.OnDemand
With the baseline established, a comparison can be made after switching the conditionally loaded modules to OnDemand.
All timings are in milliseconds.

| Method | Profile #1 | Profile #2 | Profile #3 | Profile #4 | Profile #5 | Min | Average | Median | Max | STD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| App.OnStartup | 5419 | 3969 | 4391 | 5919 | 5490 | 3969 | 5037.6 | 5419 | 5919 | 733.0750575 |
| Bootstrapper.Run | 2770 | 2197 | 2017 | 2086 | 2238 | 2017 | 2261.6 | 2197 | 2770 | 266.0320281 |
| Bootstrapper.ConfigureModuleCatalog | 408 | 374 | 340 | 352 | 388 | 340 | 372.4 | 374 | 408 | 24.40983408 |
| Bootstrapper.InitializeModules | 143 | 67 | 69 | 69 | 66 | 66 | 82.8 | 69 | 143 | 30.1224169 |
| Asynchronous Module Initialization | 1926 | 1639 | 1699 | 1603 | 1632 | 1603 | 1699.8 | 1639 | 1926 | 117.3292802 |
All the Bootstrapper methods seemed to have improved, but overall the App.OnStartup took approximately the same amount of time.
Summary
There was an impact, but not in the overall startup time - which I find a little peculiar. It seems as though the overhead may have been shifted elsewhere in the startup process.
This may mean a hybrid approach to Bootstrapper.InitializeModules does have merit, although not as much as I had hoped. Another option may be to change Bootstrapper.ConfigureModuleCatalog to conditionally determine whether to add modules instead of applying a 'safe' default. Or perhaps I am diagnosing the wrong problem and should look at other options - such as switching Dependency Injection frameworks.
In any case, I am going to discuss this as an option with my team - and see if additional testing can be done with more conditional modules enabled.
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6, and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.
Simple Solution
The simplest solution is to iterate over all numbers up to the limit. If any of these numbers is a multiple of one of the factors then it is included in the summation.
```csharp
// The method name and signature are assumed; the original declaration was not shown.
public static ulong SimpleSolution(int limit, params int[] factors)
{
    ulong sum = 0;

    for (int i = 0; i < limit; i++)
    {
        foreach (int factor in factors)
        {
            if (i % factor == 0)
            {
                // Break so a number divisible by multiple factors is only counted once.
                sum += (ulong)i;
                break;
            }
        }
    }

    return sum;
}
```
Timing the operation yields:
```
Minimum Elapsed: 00:00:00.0000114
Average Elapsed: 00:00:00.0000545
Maximum Elapsed: 00:00:00.0004331
```
Pretty quick, but that is most likely because the problem space is small. If the limit is increased, or if more factors are introduced, the number of operations performed increases. In big-O notation, this approach is $$ O(n \cdot m) $$, where $$ n $$ is the limit and $$ m $$ is the number of factors.
Asynchronous Simple Solution
Note, this particular approach is not recommended for the problem as it is laid out in the description but is included to compare results and as a thought experiment. This solution could be viable if the number of factors used increased and there was a mechanism to reduce the amount of duplicated iterative work performed.
Another possible way to structure a solution to the problem is to give each factor its own thread that calculates that factor's multiples up to the limit. Once every thread has completed, the resulting multiples are combined, and the distinct numbers are used to generate a sum:
```csharp
// The method name and signature are assumed; the original declaration was not shown.
public static async Task<int> AsynchronousSimpleSolution(int limit, params int[] factors)
{
    IList<Task<ICollection<int>>> taskCollection = new List<Task<ICollection<int>>>();

    // Start one task per factor.
    foreach (int factor in factors)
    {
        taskCollection.Add(GetFactorMultiplesBelowLimitAsync(limit, factor));
    }

    await Task.WhenAll(taskCollection);

    // Union the per-factor results so shared multiples are only counted once.
    ICollection<int> factorMultiples = new HashSet<int>(await taskCollection.First());

    for (int i = 1; i < taskCollection.Count; i++)
    {
        ICollection<int> factorMultiplesResults = await taskCollection[i];

        foreach (int factorMultiple in factorMultiplesResults)
        {
            factorMultiples.Add(factorMultiple);
        }
    }

    return factorMultiples.Sum();
}
```
The iterative work for this solution was extracted to a helper method to parallelize it:
```csharp
public static Task<ICollection<int>> GetFactorMultiplesBelowLimitAsync(int limit, int factor)
{
    // Task.Run offloads the CPU-bound loop to a thread-pool thread so each
    // factor can be processed in parallel.
    return Task.Run(() =>
    {
        ICollection<int> factorMultiples = new HashSet<int>();

        for (int i = 0; i < limit; i++)
        {
            if (i % factor == 0)
            {
                factorMultiples.Add(i);
            }
        }

        return factorMultiples;
    });
}
```
This is not the prettiest solution by any stretch and yields slightly worse results than the Simple Solution:
```
Minimum Elapsed: 00:00:00.0000317
Average Elapsed: 00:00:00.0002025
Maximum Elapsed: 00:00:00.0017206
```
The results are not surprising because of how the work is performed. There are two or more loops iterating over all of the numbers - duplicating the number of operations and comparisons that must be done.
One possible improvement for this solution could be to have a single shared collection of possible numbers that would be updated to reduce the number of iterations that are performed by each thread instead of joining the results after they have all been completed. This could also introduce a race condition so a thread-safe data structure is recommended if this approach is taken.
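A sketch of that shared-collection idea, assuming a thread-safe set built on ConcurrentDictionary (this is not the measured code):

```csharp
// A sketch of the shared-collection improvement; not the measured code.
public static async Task<int> SharedCollectionSolutionAsync(int limit, params int[] factors)
{
    // ConcurrentDictionary stands in for a thread-safe set; TryAdd is atomic,
    // so concurrent writers cannot add the same multiple twice.
    ConcurrentDictionary<int, byte> sharedMultiples = new ConcurrentDictionary<int, byte>();

    await Task.WhenAll(factors.Select(factor => Task.Run(() =>
    {
        // Stepping by the factor also avoids testing every number below the limit.
        for (int multiple = factor; multiple < limit; multiple += factor)
        {
            sharedMultiples.TryAdd(multiple, 0);
        }
    })));

    return sharedMultiples.Keys.Sum();
}
```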
Another possible improvement would be to start with the largest factor from the available factors so that the initial set of numbers is the smallest starting point that it could be to iterate over.
As stated before, this is not a recommended solution when the number of factors is small, since the iterative work is duplicated. The only redeeming quality of this approach is that the work is done on multiple threads, so when enough threads are available its best case approaches the Simple Solution. This can be seen in its Minimum Elapsed time, which is comparable to the Average Elapsed time of the previous results.
Simple LINQ Solution
The Simple Solution can be converted into a more fluent LINQ syntax - at the cost of some performance:
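A sketch of such a conversion, assuming the same signature as the Simple Solution:

```csharp
// A sketch of the LINQ conversion; the cast converts Sum()'s int result to the
// ulong return type, as discussed below.
public static ulong SimpleLinqSolution(int limit, params int[] factors)
{
    return (ulong)Enumerable.Range(0, limit)
        .Where(i => factors.Any(factor => i % factor == 0))
        .Sum();
}
```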
The type cast is necessary to convert the Sum() operation to the return type. This could cause a little performance degradation but was not significant in the measurements.
This solution is a little easier to read and understand, since it flows like a sentence.
The results of this solution are:
```
Minimum Elapsed: 00:00:00.0000574
Average Elapsed: 00:00:00.0001152
Maximum Elapsed: 00:00:00.0006285
```
As expected, this is slower than the simple solution but still fast. The tradeoff could be worth it for the improved readability.
Surprisingly, this solution is slower than the asynchronous solution in some scenarios - seen by comparing the Minimum Elapsed Time of the two results.
Algorithmic Solution
The simple solution satisfies the criteria to generate an answer, but the performance can be improved by looking for an algorithmic solution instead of a brute-force solution. Conveniently, the problem is asking for a solution to a Finite Arithmetic Progression, specifically an Arithmetic Series which has an algorithmic solution.
… a sequence of numbers such that the difference between the consecutive terms is constant. … The sum of the members of a finite arithmetic progression is called an arithmetic series. … [The] sum can be found quickly by taking the number n of terms being added, multiplying by the sum of the first and last number in the progression, and dividing by 2:

$$ S_n = \frac{n(a_1 + a_n)}{2} $$
With this formula, the sum for each specified factor can be calculated. Keep in mind that the series of any multiple shared by the factors must then be subtracted, since those numbers are counted once per factor. For the problem description, the shared multiple is $$ 3 \times 5 = 15 $$, which needs subtracting whenever the limit is larger than 15.
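Applying the formula to the problem itself (factors 3 and 5, limit 1000) gives:

$$ S_3 = \frac{333(3 + 999)}{2} = 166833 $$

$$ S_5 = \frac{199(5 + 995)}{2} = 99500 $$

$$ S_{15} = \frac{66(15 + 990)}{2} = 33165 $$

$$ S = 166833 + 99500 - 33165 = 233168 $$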
Synchronous Algorithmic Solution
A synchronous solution for this problem with only two factors could look something like this:
```csharp
// The method name and signature are assumed; the original declaration was not shown.
public static int AlgorithmicSolution(int limit, params int[] factors)
{
    int sum = 0;
    ICollection<int> factorLookup = new HashSet<int>(factors);
    ICollection<int> multiples = new HashSet<int>();

    for (int i = 0; i < factors.Length; i++)
    {
        int factor = factors[i];
        sum += AlgorithmicSumFactorBelowLimit(limit, factor);

        // Subtract the series of each shared multiple so numbers divisible by
        // more than one factor are only counted once.
        for (int j = i + 1; j < factors.Length; j++)
        {
            int multiple = factor * factors[j];

            if (!factorLookup.Contains(multiple)
                && limit > multiple
                && !multiples.Contains(multiple))
            {
                multiples.Add(multiple);
                sum -= AlgorithmicSumFactorBelowLimit(limit, multiple);
            }
        }
    }

    return sum;
}
```
The AlgorithmicSumFactorBelowLimit method looks like this:
```csharp
private static int AlgorithmicSumFactorBelowLimit(int limit, int factor)
{
    int n = (limit - 1) / factor;
    int a1 = factor;
    int an = n * a1;

    return n * (a1 + an) / 2;
}
```
One is subtracted from the limit when calculating $$ n $$ so that a factor that evenly divides the limit does not introduce an off-by-one error (the problem asks for multiples below the limit).
The performance of this solution is:
```
Minimum Elapsed: 00:00:00.0000007
Average Elapsed: 00:00:00.0000011
Maximum Elapsed: 00:00:00.0000045
```
Initially, I had the algorithmic method asynchronous to share code, but I wanted to ensure there was not any skewing of results that may have occurred from .GetAwaiter().GetResult(). Spoiler alert: the results were approximately the same in both, meaning there probably would not have been any perceptible difference. The asynchronous variant of the solution looks like this:
```csharp
// The method name, signature, and awaits are assumed; this is the asynchronous
// counterpart to the synchronous solution above.
public static async Task<int> AlgorithmicSolutionAsync(int limit, params int[] factors)
{
    int sum = 0;
    ICollection<int> factorLookup = new HashSet<int>(factors);
    ICollection<int> multiples = new HashSet<int>();

    for (int i = 0; i < factors.Length; i++)
    {
        int factor = factors[i];
        sum += await AlgorithmicSumFactorBelowLimitAsync(limit, factor);

        for (int j = i + 1; j < factors.Length; j++)
        {
            int multiple = factor * factors[j];

            if (!factorLookup.Contains(multiple)
                && limit > multiple
                && !multiples.Contains(multiple))
            {
                multiples.Add(multiple);
                sum -= await AlgorithmicSumFactorBelowLimitAsync(limit, multiple);
            }
        }
    }

    return sum;
}
```
And updating the algorithm to be asynchronous as well:
```csharp
// Marked async so the result is wrapped in a Task; there is no awaited work inside.
private static async Task<int> AlgorithmicSumFactorBelowLimitAsync(int limit, int factor)
{
    int n = (limit - 1) / factor;
    int a1 = factor;
    int an = n * a1;

    return n * (a1 + an) / 2;
}
```
Measuring the performance of this solution generated:
```
Minimum Elapsed: 00:00:00.0000006
Average Elapsed: 00:00:00.0000010
Maximum Elapsed: 00:00:00.0000022
```
In this case, the asynchronous solution impacts the performance results positively because each thread can contribute to solving the problem without duplicating any of the work.
Because each thread can do work in isolation, this solution will scale well even as the number of factors increases - as long as there are threads available to do the processing work.
My work inherited an ASP.NET WebApi Project from a contracted company. One of the first things added to the project was a Dependency Injection framework. DryIoc was selected for its speed in resolving dependencies.
In this post, I show why and how reflection was utilized to improve the Dependency Injection Registration code.
Architecture
First, let me provide a high-level overview of the architecture (using sample class names). The project is layered in the following way:
Controllers
Controllers are the entry-point for the code (just like all WebApi applications) and are structured something like this:
```csharp
[HttpGet]
[Route("{blogPostId}")]
[ResponseType(typeof(BlogPost))]
public async Task<IHttpActionResult> GetBlogPost(int blogPostId)
{
    // Logging logic to see what was provided to the Controller method.

    if (blogPostId <= default(int))
    {
        return this.BadRequest("Invalid Blog Post Id.");
    }

    // The rest of the action passes the validated parameter through to the
    // Manager layer and returns the result (elided in the original).
}
```
Controllers are responsible for ensuring sane input values are provided before passing the parameters through to the Manager layer. In doing so, the business logic is pushed downwards allowing for more re-use.
The ProjectControllerBase provides instrumentation logging and acts as a catch-all for any errors that may occur during execution:
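A sketch of what such a base controller could look like; the class shape and the ILogger members are assumptions based on the description above:

```csharp
// Hypothetical base controller; the ILogger interface and its members are assumptions.
public abstract class ProjectControllerBase : ApiController
{
    private readonly ILogger logger;

    protected ProjectControllerBase(ILogger logger)
    {
        this.logger = logger;
    }

    protected async Task<IHttpActionResult> ExecuteAsync(
        Func<Task<IHttpActionResult>> action,
        [CallerMemberName] string caller = null)
    {
        // Instrumentation logging around every action.
        this.logger.Info($"Entering {caller}.");

        try
        {
            return await action();
        }
        catch (Exception exception)
        {
            // Catch-all so unexpected failures are logged and surfaced consistently.
            this.logger.Error($"{caller} failed.", exception);
            return this.InternalServerError();
        }
        finally
        {
            this.logger.Info($"Exiting {caller}.");
        }
    }
}
```

Controller actions can then wrap their bodies in ExecuteAsync so the logging and error handling stay in one place.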
My goal is to refactor this at some point to remove the ILogger dependency.
Managers
Managers perform more refined validation and contain the business logic for the application. This allows for the business logic to be referenced by a variety of front-end applications (website, API, desktop application) easily.
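A hypothetical Manager illustrating this layering (all type names are assumptions based on the controller example above):

```csharp
// Hypothetical Manager from the layering described above; names are assumptions.
public class BlogPostManager : IBlogPostManager
{
    private readonly IBlogPostRepository blogPostRepository;

    public BlogPostManager(IBlogPostRepository blogPostRepository)
    {
        this.blogPostRepository = blogPostRepository;
    }

    public async Task<BlogPost> GetBlogPost(int blogPostId)
    {
        // Business-level validation beyond the controller's sanity checks would
        // live here, keeping the rules re-usable across front-ends.
        BlogPostEntity blogPostEntity =
            await this.blogPostRepository.GetBlogPost(blogPostId);

        return blogPostEntity == null
            ? null
            : new BlogPost { /* map entity properties to the API model */ };
    }
}
```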
Repositories
Repositories act as the data access component for the project.
Initially, Entity Framework was used exclusively for data access. However, for read operations, Entity Framework is being phased out in favor of Dapper due to performance issues.
Entity Framework automatically uses an ORDER BY clause to ensure results are grouped. In some cases, this caused queries to time out. Often, this is a sign that the data model needs to be improved and/or that the SQL queries were too large (joining too many tables).
Additionally, our Database Administrators wanted read operations to use WITH (NOLOCK).
To the best of our knowledge, a QueryInterceptor would need to be used. This seemed to be counter-intuitive and our aggressive timeline would not allow for any time to tweak and experiment with the Entity Framework code.
For insert operations, Entity Framework is preferred.
```csharp
public async Task<BlogPostEntity> GetBlogPost(int blogPostId)
{
    // Logging logic to see what was provided to the Repository method.

    DynamicParameters sqlParameters = new DynamicParameters();
    sqlParameters.Add(nameof(blogPostId), blogPostId);

    StringBuilder sqlBuilder = new StringBuilder()
        .AppendFormat(
            @"SELECT * -- Wildcard would not be used in actual code.
              FROM blog_posts WITH (NOLOCK)
              WHERE blog_posts.blog_post_id = @{0}",
            nameof(blogPostId));

    using (SqlConnection sqlConnection = new SqlConnection(this.databaseConnectionString))
    {
        await sqlConnection.OpenAsync();

        // Logging logic to time the query.
        BlogPostEntity blogPostEntity = await sqlConnection.QueryFirstOrDefaultAsync<BlogPostEntity>(
            sqlBuilder.ToString(),
            sqlParameters);

        return blogPostEntity;
    }
}
```
Problems with this manual registration would occasionally arise when a developer introduced new Manager or Repository classes but did not remember to register instances of those classes with the Dependency Injection container. When this occurred, compilation and deployment would succeed, but the following runtime error would be thrown when the required dependencies could not be resolved:
An error occurred when trying to create a controller of type ‘BlogPostController’. Make sure that the controller has a parameterless public constructor.
The generated error message does not help identify the underlying issue.
To prevent this from occurring, all Manager and Repository classes would need to automatically register themselves during start-up.
Reflection
To automatically register classes, reflection can be utilized to iterate over the assembly types and register all Manager and Repository implementations. Initially, this was done by loading the assembly containing the types directly from the disk:
```csharp
// The hard-coded path and the registration call here are illustrative; the
// original values were project-specific.
Assembly dependencyAssembly = Assembly.LoadFrom(@"bin\Project.dll");

foreach (Type exportedType in dependencyAssembly.GetExportedTypes())
{
    // Skip registering items that are an interface or abstract class since it is
    // not known if there is an implementation defined in this assembly.
    if (DependencyInjectionConfiguration.IsInterfaceOrAbstractClass(exportedType))
    {
        continue;
    }

    // Skip registering items that are not a Manager or Repository.
    if (DependencyInjectionConfiguration.IsNotManager(exportedType)
        && DependencyInjectionConfiguration.IsNotRepository(exportedType))
    {
        continue;
    }

    // Register the implementation with the DryIoc container against the
    // interfaces it implements.
    container.RegisterMany(new[] { exportedType });
}
```
While this works, it felt wrong to load the assembly from disk using a hard-coded path, especially when the framework will load the assembly automatically anyway. To account for this, the code was modified in the following manner:
```csharp
// Resolving the assembly from a type it contains avoids the hard-coded path;
// the marker type name here is hypothetical.
Assembly dependencyAssembly = typeof(BlogPostManager).Assembly;

foreach (Type exportedType in dependencyAssembly.GetExportedTypes())
{
    // Skip registering items that are an interface or abstract class since it is
    // not known if there is an implementation defined in this assembly.
    if (DependencyInjectionConfiguration.IsInterfaceOrAbstractClass(exportedType))
    {
        continue;
    }

    // Skip registering items that are not a Manager or Repository.
    if (DependencyInjectionConfiguration.IsNotManager(exportedType)
        && DependencyInjectionConfiguration.IsNotRepository(exportedType))
    {
        continue;
    }

    // Register the implementation with the DryIoc container against the
    // interfaces it implements.
    container.RegisterMany(new[] { exportedType });
}
```
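The helper predicates are not shown in the post; plausible implementations, assuming the project follows a type-name suffix convention, could be:

```csharp
// Hypothetical implementations of the helpers used above; the originals were
// not shown, and the naming convention is an assumption.
private static bool IsInterfaceOrAbstractClass(Type type)
{
    return type.IsInterface || type.IsAbstract;
}

private static bool IsNotManager(Type type)
{
    return !type.Name.EndsWith("Manager", StringComparison.Ordinal);
}

private static bool IsNotRepository(Type type)
{
    return !type.Name.EndsWith("Repository", StringComparison.Ordinal);
}
```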
Unfortunately, no timing metrics are available to measure whether either implementation performs better. With that said, the second implementation seems faster. This may be because the assembly is already loaded due to other registrations that occur before the reflection registration code executes. For this reason, results may vary from project to project.
Overall, the solution works well and has limited the runtime error to appearing only when a new Entity Framework context is added to the project.
One of my brothers gifted Terraria to me on Steam. Needless to say, instead of doing the things that I ought to be doing, I have been playing it far more frequently - at least until my world save became corrupted.
While I was playing, Terraria suddenly crashed. I figure it was in the middle of a save/backup because when I tried to load the game back up Terraria informed me that my save was corrupted (I am not sure what caused it either). I shrugged it off thinking “No big deal, I will just use the backup file.” As it turned out, the backup was not as recent as I would have liked.
I had just gotten through a particularly nasty section of sand and had no desire to repeat it. Logically, the next thing to try was to open the world save in TEdit to see if there was anything salvageable. To my surprise, I opened the world save with TEdit and was informed that TEdit could try to recover it. Cool! After it loaded the map, nothing seemed out of place, so I saved it and went on with my business of exploring.
It was not until I had an inventory full of items that I realized what TEdit had not been able to recover - chest data. All the chests in my player home were now empty. Curious, I checked other chests using TEdit. All of the chests I checked were empty too! I would later find out that about 100 of the chests were now empty.
That would not do; that was about a third of the chests in my world. Half the fun of exploring and finding a chest is the goodies you get inside it. The only option I could think of at the time was to create a new world, maybe even using the same world seed to get the same map. However, I was not sure (and still am not sure) whether chest contents are randomly generated, which could mean that even using the same world seed would not result in the same items. To top it off, starting over would have been worse than just going through the section of sand I was trying to avoid. So the only real option was to see if I could repair the world save to get the chest items back.
Fortunately, I had been backing my save files up to Google Drive. Without these backups, there would have been no way to restore the chest data. That does not mean it was a breeze, though; there were a few hiccups along the way, most of them involving the sheer amount of data included in the world save. In the end, I was able to restore my game save to a state close enough to where it was that I do not miss anything. I am probably still missing a few items, but nothing I have noticed, so I do not feel like I lost anything.
Researching the Terraria Source Code
The first step of the process required parsing world save data: I had to find the chest data in the good backup save to get the list of items that I was missing. Luckily, the Terraria client is fairly easy to decompile (NOTE: the decompiled source code will not run without modifications). There could be legal implications to decompiling the Terraria client - I do not know, I did not read the End User License Agreement. Instead, I used a repository of decompiled source that someone else had produced with dotPeek and posted. Using the decompiled source code, I could replicate how the game client reads the world save data and compare the chests from two of my world saves.
The code I was looking for is located in Terraria/IO/WorldFile.cs and begins in loadWorld. loadWorld does some date checking for special events, checks if the file exists, and then reads some data to determine how to parse the rest of the file. Depending on this value, the code is directed to either WorldFile.LoadWorld_Version1 or WorldFile.LoadWorld_Version2. Since I knew my world file was fairly recent, I immediately continued to WorldFile.LoadWorld_Version2.
```csharp
public static int LoadWorld_Version2(BinaryReader reader)
{
    reader.BaseStream.Position = 0L;
    bool[] importance;
    int[] positions;
    if (!WorldFile.LoadFileFormatHeader(reader, out importance, out positions)
        || reader.BaseStream.Position != (long) positions[0])
        return 5;
    WorldFile.LoadHeader(reader);
    if (reader.BaseStream.Position != (long) positions[1])
        return 5;
    WorldFile.LoadWorldTiles(reader, importance);
    if (reader.BaseStream.Position != (long) positions[2])
        return 5;
    WorldFile.LoadChests(reader);
    if (reader.BaseStream.Position != (long) positions[3])
        return 5;
    WorldFile.LoadSigns(reader);
    if (reader.BaseStream.Position != (long) positions[4])
        return 5;
    WorldFile.LoadNPCs(reader);
    if (reader.BaseStream.Position != (long) positions[5])
        return 5;
    if (WorldFile.versionNumber >= 116)
    {
        if (WorldFile.versionNumber < 122)
        {
            WorldFile.LoadDummies(reader);
            if (reader.BaseStream.Position != (long) positions[6])
                return 5;
        }
        else
        {
            WorldFile.LoadTileEntities(reader);
            if (reader.BaseStream.Position != (long) positions[6])
                return 5;
        }
    }
    if (WorldFile.versionNumber >= 170)
    {
        WorldFile.LoadWeightedPressurePlates(reader);
        if (reader.BaseStream.Position != (long) positions[7])
            return 5;
    }
    if (WorldFile.versionNumber >= 189)
    {
        WorldFile.LoadTownManager(reader);
        if (reader.BaseStream.Position != (long) positions[8])
            return 5;
    }
    return WorldFile.LoadFooter(reader);
}
```
The WorldFile.LoadWorld_Version2 function provided me with a layout of the different sections in a world save. The data appeared to be broken out into the following sections which are read sequentially from the world save:
File Format Header
Header
World Tiles
Chests
Signs
NPCs
Tile Entities
Weighted Pressure Plates
Town Manager
Footer
Wow! That is more data than I thought there would be. It appears as though after every section a check is done to verify that the current position in the data file matches a position that is read from the WorldFile.LoadFileFormatHeader. I should be able to use the position at the corresponding index to jump directly to the Chest data section.
The WorldFile.LoadFileFormatHeader is responsible for reading the WorldFileMetadata, the section positions, and an array of flags known as 'importance'. Each section position is represented as a single integer value. I am familiar with this kind of data storage technique since HTTP packets do something similar:
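To illustrate (this is a simplified sketch, not the decompiled Terraria code), the position table can be read and then used to jump directly to a section:

```csharp
// Simplified sketch of reading the position table from the file format header;
// not the decompiled Terraria code.
short sectionCount = reader.ReadInt16();
int[] positions = new int[sectionCount];

for (int i = 0; i < sectionCount; i++)
{
    // Each entry is the absolute byte offset where a section boundary falls.
    positions[i] = reader.ReadInt32();
}

// Per LoadWorld_Version2, the chest section begins at positions[2] (the end of
// the world tiles section), so the reader can jump straight there.
reader.BaseStream.Position = positions[2];
```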
Using the section list from the WorldFile.LoadWorld_Version2 and the position data from the WorldFile.LoadFileFormatHeader, I could read the Chest data immediately by jumping to that position in the save file. The next question was to determine how the Chest data was stored.
```csharp
private static void LoadChests(BinaryReader reader)
{
    int num1 = (int) reader.ReadInt16();
    int num2 = (int) reader.ReadInt16();
    int num3;
    int num4;
    if (num2 < 40)
    {
        num3 = num2;
        num4 = 0;
    }
    else
    {
        num3 = 40;
        num4 = num2 - 40;
    }
    int index1;
    for (index1 = 0; index1 < num1; ++index1)
    {
        Chest chest = new Chest(false);
        chest.x = reader.ReadInt32();
        chest.y = reader.ReadInt32();
        chest.name = reader.ReadString();
        for (int index2 = 0; index2 < num3; ++index2)
        {
            short num5 = reader.ReadInt16();
            Item obj = new Item();
            if ((int) num5 > 0)
            {
                obj.netDefaults(reader.ReadInt32());
                obj.stack = (int) num5;
                obj.Prefix((int) reader.ReadByte());
            }
            else if ((int) num5 < 0)
            {
                obj.netDefaults(reader.ReadInt32());
                obj.Prefix((int) reader.ReadByte());
                obj.stack = 1;
            }
            chest.item[index2] = obj;
        }
        for (int index2 = 0; index2 < num4; ++index2)
        {
            if ((int) reader.ReadInt16() > 0)
            {
                reader.ReadInt32();
                int num5 = (int) reader.ReadByte();
            }
        }
        Main.chest[index1] = chest;
    }
    List<Point16> point16List = new List<Point16>();
    for (int index2 = 0; index2 < index1; ++index2)
    {
        if (Main.chest[index2] != null)
        {
            Point16 point16 = new Point16(Main.chest[index2].x, Main.chest[index2].y);
            if (point16List.Contains(point16))
                Main.chest[index2] = (Chest) null;
            else
                point16List.Add(point16);
        }
    }
    for (; index1 < 1000; ++index1)
        Main.chest[index1] = (Chest) null;
    if (WorldFile.versionNumber >= 115)
        return;
    WorldFile.FixDresserChests();
}
```
Not quite as straight-forward to decipher, but still doable:
num1 represents the number of chests stored in the world.
num2 represents the number of item slots serialized for each chest.
num3 represents the number of item slots actually read into the chest (capped at the 40-slot maximum).
num4 represents the overflow slots that are read and discarded when a chest reports more than the maximum.
Each chest then has the following properties:
x represents the x-coordinate of the chest in the world.
y represents the y-coordinate of the chest in the world.
name represents the name of the chest in the world.
I found it odd that the Chest data did not have an Id property that could be used to identify specific chests. Though the x and y properties are probably sufficient in most cases, it just meant that I would have to be more careful about identifying chests.
Depending on the next value a chest can have between 0 and 40 items that have the following properties:
id represents the item id.
stack represents the quantity of that item in the single item slot.
prefix represents a prefix value that affects the stats on the item.
The code deviates a little from the normal layout when it comes to items. Instead of an Int16 count of the items in a chest, every slot is read; an empty slot is stored as a zero stack size, and the code must account for that. This is because items can be located anywhere within the chest's 40 slots. Storing the data this way lets the game reduce the size of the world save file.
After reviewing WorldFile.LoadChests, I had enough information to parse the chest data.
Applying the Research
In an effort NOT to confuse myself, I tried to break the sections out a little further into methods. The code was written using LINQPad to rapidly develop the prototype for reading the data. Main is the entry point for the ‘script’. The final code can be found in Appendix A.
```csharp
void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
}

World GetWorld(string worldPath)
{
    using (BinaryReader worldReader = new BinaryReader(File.OpenRead(worldPath)))
    {
        return GetWorld(worldReader);
    }
}
```
GetWorld takes a string file path where the world file is located. This would allow me to add a single line to get the data from BadWorld once I was satisfied that the current implementation worked as intended. Once the file was opened for reading, the data needed to be read:
```csharp
World GetWorld(BinaryReader worldReader)
{
    World world = new World();

    // The remainder of the block reads the metadata, section positions, and
    // chest data in order; each step is broken out below.
}
```
The File MetaData is read to get the BinaryReader to the correct position. This could be done with math - two Int32 values (8 bytes) plus two Int64 values (16 bytes) yields a 24-byte offset - but to maintain readability, reading the MetaData seemed more appropriate than jumping to an arbitrary location, especially since there is an unknown value and the structure of this part of the file could change between versions. An exception to this 'rule' is made once the section data has been read, since it is more obvious why and where the jump is occurring.
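A sketch of the metadata read (the World property names mirror the WriteWorldMetaData method shown later in this post):

```csharp
// Reads the 24 bytes of metadata described above; property names are assumed
// to mirror WriteWorldMetaData.
void GetWorldMetaData(BinaryReader worldReader, World world)
{
    world.Version = worldReader.ReadInt32();         // Int32: save format version
    world.TypeCheck = worldReader.ReadInt64();       // Int64: file type check
    world.Revision = worldReader.ReadInt32();        // Int32: save revision counter
    world.UnknownMetaData = worldReader.ReadInt64(); // Int64: the unknown value
}
```

With the section positions read and the jump to the chest section made, each chest can be parsed: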
```csharp
    for (int i = 0; i < world.TotalChests; i++)
    {
        world.ChestCollection[i] = GetChestData(worldReader, itemsPerChest, overflowItems);
    }
}

Chest GetChestData(BinaryReader worldReader, int itemsPerChest, int overflowItems)
{
    Chest chest = new Chest(itemsPerChest)
    {
        X = worldReader.ReadInt32(),
        Y = worldReader.ReadInt32(),
        Name = worldReader.ReadString(),
    };

    for (int i = 0; i < itemsPerChest; i++)
    {
        chest.ItemCollection[i] = GetItemData(worldReader);
    }

    // Overflow items are read to advance the stream but are discarded.
    for (int i = 0; i < overflowItems; i++)
    {
        GetItemData(worldReader);
    }

    return chest;
}
```
Each chest can contain items in any of its slots, up to the slot maximum. For each chest, the items are parsed in a separate method:
```csharp
Item GetItemData(BinaryReader worldReader)
{
    short stackSize = worldReader.ReadInt16();

    // An empty slot is stored as just a zero stack size; the merge code later
    // relies on these slots being null. (The body below is reconstructed from
    // the LoadChests logic shown earlier.)
    if (stackSize == 0)
    {
        return null;
    }

    return new Item
    {
        Id = worldReader.ReadInt32(),
        StackSize = stackSize,
        Prefix = worldReader.ReadByte(),
    };
}
```
With the parsing in place, the empty chests in each world can be counted:

```csharp
void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);
}
```
```
13 Empty Chests in Good World out of 302
145 Empty Chests in Bad World out of 302
```
That seems to match what I was seeing. Although I was a little surprised there were empty chests in the good world. It turned out that the empty chests in the good world came from the Hell layer and one in my player home that I was not utilizing.
The next step is to try to compare the chests in the two worlds to see if they can be matched. If they can, a repair is possible.
Comparing Chest Data
Unfortunately, there is no way to avoid iterating over both lists in order to compare them. However, the size of the lists can be reduced by retrieving only the empty chests from the bad world. It can be reduced even further by excluding chests contained within player housing. I recommend doing this because those chests have been touched by the player(s); I was more comfortable handling them by hand with TEdit. It probably would not have made any difference though:
```csharp
void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    // Copy the generated chest contents from the good world into the bad world.
    MergeEmptyChests(GoodWorld.ChestCollection, BadWorld.ChestCollection);

    Console.WriteLine();
    Console.WriteLine("After Merge:");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);
}
```
Then each Chest from the bad world is identified and the items it should contain are added from the matching Chest in the good world.
```csharp
void MergeEmptyChests(Chest[] sourceChests, Chest[] destinationChests)
{
    foreach (Chest destinationChest in destinationChests)
    {
        foreach (Chest sourceChest in sourceChests)
        {
            // A chest only matches when both coordinates are identical, so
            // skip when either one differs.
            if (destinationChest.X != sourceChest.X
                || destinationChest.Y != sourceChest.Y)
            {
                continue;
            }

            int sourceChestItemLength = sourceChest.ItemCollection.Length;

            for (int i = 0; i < sourceChestItemLength; i++)
            {
                destinationChest.ItemCollection[i] = sourceChest.ItemCollection[i];
            }

            // The match has been copied; stop scanning the source chests.
            break;
        }
    }
}
```
The only way to identify chests in the world is by their location. Name could possibly be used, except that a named chest likely means the player placed it intentionally. As previously stated, I wanted to handle any chests that I had placed myself manually, since I needed to compare them against the chests I had put items into after noticing the save file corruption.
After running the application I get the following output:
```
13 Empty Chests in Good World out of 302
145 Empty Chests in Bad World out of 302

After Merge:
13 Empty Chests in Bad World out of 302
```
Perfect, I was able to restore all of the wild chests back to their generated state. The last step in the process is saving the changes.
Saving Chest Data
Everything up to this point had been relatively easy, so it was only a matter of time before I came across an issue. Unfortunately, the issue did not appear until I got to the final step of saving the merged Chest data.
The Issue
Remember how Chest data is read? Or more specifically, how the item data is read?
```csharp
Chest GetChestData(BinaryReader worldReader, int itemsPerChest, int overflowItems)
{
    Chest chest = new Chest(itemsPerChest)
    {
        X = worldReader.ReadInt32(),
        Y = worldReader.ReadInt32(),
        Name = worldReader.ReadString(),
    };

    for (int i = 0; i < itemsPerChest; i++)
    {
        chest.ItemCollection[i] = GetItemData(worldReader);
    }

    for (int i = 0; i < overflowItems; i++)
    {
        GetItemData(worldReader);
    }

    return chest;
}

Item GetItemData(BinaryReader worldReader)
{
    short stackSize = worldReader.ReadInt16();

    // An empty slot is only a zero stack size, while a filled slot also
    // carries an id and a prefix.
    if (stackSize == 0)
    {
        return null;
    }

    return new Item
    {
        Id = worldReader.ReadInt32(),
        StackSize = stackSize,
        Prefix = worldReader.ReadByte(),
    };
}
```
The issue is that the Chest data has grown after the merge, since items have been added that require data representations for a stack size, an id, and (potentially) a prefix. This made my plan to simply overwrite the data in the Chest data section impossible: doing so overwrites the data in the next section and consequently causes the section locations to be incorrect. I was able to come up with only two ways to solve this:
Parse Entire WorldFile
The first solution was to add support for parsing the entire WorldFile save into the object that is being manipulated. This would prevent any data from getting overwritten but would still require the section data to be updated, either by mathematically calculating the size of each section given the size of each representation of the data in those sections, or by updating the section positions after each section was written.
While this is probably a more robust approach in the long run, it would require a lot of effort on my part to build all of the representations for the data in each section and the corresponding code to read it.
Partition WorldFile
The second solution would segment the file into ~5 partitions: three partitions for the data that is not changing, and two partitions for the data that needs to be updated (the section locations and the Chest data). In this way, all of the data is accounted for without actually defining a representation for each piece of data the way the previous solution would have required. Keep in mind that this solution is only worthwhile because the changes that were made were local to the Chest data section; if this were not the case, 2n+1 partitions (where n is the number of sections) would need to be maintained:
| Partition Name | Partition Type |
| --- | --- |
| MetaData | Original |
| Section Location | Modified |
| 2x Sections | Original |
| Chest Data | Modified |
| Remaining Sections | Original |
Solution Implementation
I chose the second solution as it seemed like the easier of the two for what I was trying to accomplish. If this were not a one-off project, I would suggest the first solution. To implement the second solution, all of the data from the file needs to be read. Fortunately, the data that I am not interested in can be read into byte arrays, and during the save process the original data can be preserved by writing those byte arrays back out. The GetWorld function then gets updated to look like this:
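A sketch of the partitioned reader (the byte-array properties on World and the helper names are assumptions used for illustration):

```csharp
// A sketch of the partitioned GetWorld; the PreChestData/PostChestData
// properties and helper names are assumptions.
World GetWorld(BinaryReader worldReader)
{
    World world = new World();

    GetWorldMetaData(worldReader, world);  // Partition 1: kept as-is.
    GetWorldSections(worldReader, world);  // Partition 2: rewritten on save.

    // Partition 3: everything between the section table and the chest section,
    // preserved as an opaque byte array.
    world.PreChestData = worldReader.ReadBytes(
        (int)(world.ChestSectionPosition - worldReader.BaseStream.Position));

    GetWorldChests(worldReader, world);    // Partition 4: rewritten on save.

    // Partition 5: the remainder of the file, also preserved verbatim.
    world.PostChestData = worldReader.ReadBytes(
        (int)(worldReader.BaseStream.Length - worldReader.BaseStream.Position));

    return world;
}
```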
Now that I had all of the file data in a mechanism that prevents data loss, I could write the function to save the world file and call it from Main after the Merge.
```csharp
void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    MergeEmptyChests(GoodWorld.ChestCollection, BadWorld.ChestCollection);

    Console.WriteLine();
    Console.WriteLine("After Merge:");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    // Save to a new file so the original stays intact; the output name here is
    // illustrative.
    SaveWorld(BadWorld, @"201709010835-repaired.wld");
}
```
```csharp
void SaveWorld(World world, string worldFileSave)
{
    using (BinaryWriter binaryWriter = new BinaryWriter(File.Create(worldFileSave)))
    {
        SaveWorld(binaryWriter, world);
    }
}
```
Just like before, this is just a utility method around the SaveWorld overload that actually does all the work. The worldFileSave parameter is the location to save the data; I recommend making it different from the original file to prevent further data loss!
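The worker method could look something like this sketch (the partition properties and the UpdateSectionPositions helper are assumptions):

```csharp
// A sketch mirroring the partitioned GetWorld above; property and helper names
// are assumptions.
void SaveWorld(BinaryWriter worldWriter, World world)
{
    WriteWorldMetaData(worldWriter, world);

    // Remember where the section table lives so it can be rewritten once the
    // new chest section size is known.
    long sectionPosition = worldWriter.BaseStream.Position;
    WriteWorldSections(worldWriter, world);

    worldWriter.Write(world.PreChestData);

    long newChestDataSectionPosition = worldWriter.BaseStream.Position;
    WriteWorldChests(worldWriter, world);

    // Hypothetical helper: shifts each section offset at or after the chest
    // section by the amount the chest data grew.
    world.UpdateSectionPositions(newChestDataSectionPosition, worldWriter.BaseStream.Position);

    worldWriter.Write(world.PostChestData);

    // Move back and rewrite the section table with the corrected offsets.
    worldWriter.BaseStream.Position = sectionPosition;
    WriteWorldSections(worldWriter, world);
}
```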
This method is the exact opposite of the GetWorld method, with a few extra modifications. The sectionPosition is captured so the section offsets can be updated at the end, once all the data has been written and the size of the Chest data section is known. The newChestDataSectionPosition local variable is used to update the current section data to the new values. The writer then moves back to the section offset location stored in sectionPosition and writes the new section data. Each write method defined in this method is the inverse of the corresponding get method. The BinaryWriter will try to pad some of the values, which is why the data is cast to the proper type before writing; this could probably be corrected by changing the underlying value in the corresponding object to the exact type needed:
```csharp
void WriteWorldMetaData(BinaryWriter worldWriter, World world)
{
    worldWriter.Write((Int32)world.Version);
    worldWriter.Write((Int64)world.TypeCheck);
    worldWriter.Write((Int32)world.Revision);
    worldWriter.Write((Int64)world.UnknownMetaData);
}
```
No scary code here.
```csharp
void WriteWorldSections(BinaryWriter worldWriter, World world)
{
    worldWriter.Write((Int16)world.SectionCount);

    // ... the section positions and 'importance' flags follow.
}
```
The value of 0 needs to be explicitly cast down to an Int16, as an integer literal is an Int32 by default. This was causing my file to be larger than it needed to be at first and prevented it from being loaded.
Results
Once the file has been written to disk, it can be loaded in TEdit or Terraria to see if it can be parsed.
Alternate Solutions
Here are a couple of alternative approaches to the one outlined above.
TEdit
It would have been faster to reference the TEdit executable and reuse the code it contains to parse the world file data for each of the saves. This would have been like the first solution, except I would be relying on TEdit's implementation instead of rolling my own. The only code that would have needed to be written is MergeEmptyChests.
Terraria
Similar to the previous solution, it may be possible to reference the Terraria executable and reuse the code to load worlds. This would have made the first solution redundant and in hindsight is probably what I should have done in the first place. The save functionality probably does not have a Validate method check like TEdit does though.
Summary
In the end, was it worth it?
Probably not. Honestly, it would have been faster to go through the sand pit again than to research and code a solution like this. Granted, at the time I had no idea I had lost anything. At least I was able to recover from my mistake and learn something in the process.
Appendix A

```csharp
void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    MergeEmptyChests(GoodWorld.ChestCollection, BadWorld.ChestCollection);

    Console.WriteLine();
    Console.WriteLine("After Merge:");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest => chest.ItemCollection.All(item => item == null))
            .Count(),
        BadWorld.ChestCollection.Length);
}

// Chest-section parsing (invoked once the reader has jumped to the chest section).
for (int i = 0; i < world.TotalChests; i++)
{
    world.ChestCollection[i] = GetChestData(worldReader, itemsPerChest, overflowItems);
}

Chest GetChestData(BinaryReader worldReader, int itemsPerChest, int overflowItems)
{
    Chest chest = new Chest(itemsPerChest)
    {
        X = worldReader.ReadInt32(),
        Y = worldReader.ReadInt32(),
        Name = worldReader.ReadString(),
    };

    for (int i = 0; i < itemsPerChest; i++)
    {
        chest.ItemCollection[i] = GetItemData(worldReader);
    }

    // Overflow items are read to advance the stream but are discarded.
    for (int i = 0; i < overflowItems; i++)
    {
        GetItemData(worldReader);
    }

    return chest;
}

Item GetItemData(BinaryReader worldReader)
{
    short stackSize = worldReader.ReadInt16();

    // An empty slot is stored as just a zero stack size.
    if (stackSize == 0)
    {
        return null;
    }

    return new Item
    {
        Id = worldReader.ReadInt32(),
        StackSize = stackSize,
        Prefix = worldReader.ReadByte(),
    };
}
```

The remaining helper methods (GetWorld, SaveWorld, MergeEmptyChests, and the write methods) are shown in the sections above.