Automating Validation and Update of the NHD

Video Transcript

Detailed Description

Presenter: Ethan Shavers, USGS National Geospatial Program, Center of Excellence for Geospatial Information Science (CEGIS)

Topic: Automating Validation and Update of the NHD

Researchers with the Center of Excellence for Geospatial Information Science (CEGIS) are developing and testing methods for automated validation and update of the NHD using 3DEP data. An overview of this research will be presented, including artificial neural network strategies and geomorphic analysis.

Links:

Article referenced for sinuosity and the Richardson plot: Richardson, L.F., 1961, The problem of contiguity: an appendix to Statistics of Deadly Quarrels. General Systems Yearbook 6: 139-187.

Details

Date Taken:

Length: 01:01:27

Location Taken: US

Transcript

Thanks, Al. Yeah, my name is Ethan Shavers. I've been with CEGIS coming up on three years — no, just over three years at the holidays. I started with the group as a Mendenhall postdoc and then transitioned into a researcher and supervisor within the group. I'm really honored to be talking with this group; I've heard good things about these calls, and Al speaks highly of the type of discussion that happens here, so thank you for having me. Just to check, you can see my screen now, the PDF? Yeah, looks good, great.

OK, so I'm going to be talking about work that we're doing within CEGIS. CEGIS is a research group inside of the National Geospatial Program and, as Al mentioned, our work is all on looking at advanced geospatial methods and their applications to cartography. I'm going to be talking about work where the goal is automated update and validation of the NHD, especially in light of the new datasets that are being collected now, including lidar and IfSAR. Let me get my navigation sorted out here.

First, to mention some of our collaborators — and this is only some of them. We certainly have a lot of people we work with at educational institutions, other people within state and federal government, and actually internationally as well, but these are some of our primary collaborators, and some of these names will come up again throughout the presentation.

To begin, we're looking at updating and validating NHD features, so surface water features. With that goal in mind, the strategies that we're employing are remote sensing and modeling using data that will potentially be available for the whole country in the near future — for the foreseeable future, I should say.

A common way to derive hydrographic features is flow accumulation modeling. It's a standard way to get flow networks, as I'm sure most people on this call are aware, but there are a lot of challenges related to that: things like inaccuracies in elevation models, roads that can cause obstructions, and then the need for thresholds to determine drainage density. And of course these things need to be validated and edited manually.

I mentioned thresholding and inaccuracies in elevation data — these things can have especially large effects on headwater features. So this is an example: when we talk about headwater features, this is the initiation of flow, the very upstream extent of a network; these are also called low-order or first-order streams. As you can see in these pictures — this is from our study area in the Midwest, in Missouri — these are mutable features. They change a lot; the location of that stream head can shift, and a single large storm event could change it. They're also very small features, so they can be hard to identify from remote sensing data. You take that, and then you add in the difficulties related to flow accumulation modeling, and these are really the hardest places to model and validate hydrographic features. This is a problem that others have recognized, so most of you are probably aware of these challenges.

So we have the challenge of modeling and validating these headwater features, and we also have the challenge that there are a lot of them.
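As a rough illustration of the flow accumulation modeling described above, here is a minimal Python sketch; it is not the presenter's code, and the pure-NumPy D8 routine, the function name, and the example threshold in the closing comment are assumptions made for illustration only.

```python
import numpy as np

def d8_flow_accumulation(dem):
    """Minimal D8 flow accumulation: each cell drains to its steepest
    downslope neighbor, and contributing-cell counts are passed along
    by visiting cells from highest to lowest elevation."""
    rows, cols = dem.shape
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                 (0, 1), (1, -1), (1, 0), (1, 1)]
    dists = [np.hypot(dr, dc) for dr, dc in neighbors]
    acc = np.ones(dem.shape, dtype=np.int64)        # each cell contributes itself
    for idx in np.argsort(dem, axis=None)[::-1]:    # visit cells high to low
        r, c = divmod(idx, cols)
        best_slope, target = 0.0, None
        for (dr, dc), d in zip(neighbors, dists):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                slope = (dem[r, c] - dem[rr, cc]) / d
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
        if target is not None:                      # pass counts to the steepest neighbor
            acc[target] += acc[r, c]
    return acc

# Channels could then be initiated wherever accumulation exceeds a chosen
# density threshold, e.g. candidate = d8_flow_accumulation(dem) >= 1000
```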
This is a graphic showing, for the 1:24,000-scale NHD, first-order streams as a percentage of total length, and what it shows is that at the 1:24,000 scale roughly 52 percent — greater than 50 percent — of the length is first-order features, and it is understood that the NHD is under-represented as far as headwater streams go. So there is a massive amount of features that are going to be very hard for us to model, update, and validate reliably, and that's where most of our work is focused: on identifying these headwater stream features. On top of that, areas that offer extra challenges are low-relief agricultural areas, where surface flow is hard to model, and also forested areas, especially with steep slopes.

Some of the strategies I'm going to be talking about today — you know, I mentioned I've only been with CEGIS for a few years, so all of this is ongoing research, relatively new research projects, and none of it is operational yet. For the things that we are getting good results with, we're moving forward and looking at how to expand them and move them into an automated environment. So this is all still research in progress, just to get that out of the way. For stream identification, some of the things I'll talk about are using channel cross sections and taking metrics from those, some machine learning methods that we're looking at, and then also some stream characterization methods we're looking at to potentially add value — extra attributes — to the NHD and be able to preserve those attributes at different scales.

So, cross sections. This is really a classic geomorphic analysis method — taking cross sections of stream channels, and of all sorts of geomorphic features certainly — and it's also a data reduction method. We're looking at analyzing elevation data and other raster datasets, pixel datasets, reflectance data, satellite imagery, at fine scales, 1-meter and sometimes submeter scales, and we're hoping to look at these for the whole country, so ways to reduce the processing load are definitely beneficial. So there are dual benefits there.

The strategy that we decided to go with here is to over-extract flow lines — we're looking for methods that don't require user input; completely automated and hands-off is the goal — so over-extract flow lines up to the catchment boundaries, draw profiles across those flow lines, and extract DEM values along them, as you can see in this plot: elevation values as we go across a cross section. This is a stream from one of our North Carolina study areas. We're looking at those profiles both in raw form and with different smoothing methods, and once we have those, we have tested different statistics on these cross sections. Some of the things that we tested are curvature, standard deviation of curvature, coefficient of variation, slope — max and min slope — area under the curve, area of the channel; we've tested a lot of different things here, and these are some of the things other people have looked at. Curvature is a common metric that's used for stream channel delineation, so this isn't necessarily an original approach, although this is the first use of it on cross sections — and not the first use of curvature, but some of the first use of some of these other metrics, such as standard deviation of curvature and coefficient of variation.
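To make the per-cross-section metric concrete, here is a minimal sketch, assuming a profile given as arrays of along-profile distance and elevation; the function names and the finite-difference curvature formula for a profile graph are the editor's assumptions, not the presenter's implementation.

```python
import numpy as np

def profile_curvature(distance, elevation):
    """Curvature of an elevation profile z(d): z'' / (1 + z'^2)^(3/2),
    estimated with finite differences along the cross section."""
    dz = np.gradient(elevation, distance)
    d2z = np.gradient(dz, distance)
    return d2z / (1.0 + dz ** 2) ** 1.5

def stdev_of_curvature(distance, elevation):
    """The single value kept per cross section: the standard deviation
    of curvature along that profile."""
    return float(np.std(profile_curvature(distance, elevation)))
```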
So here we have that sample plot — I hope you can see my cursor — elevation across a profile here; I'm going back and forth between the terms profile and cross section. And this is an example of what the curvature signal would look like as you move along. The curvature is calculated by looking at a point on that line and the points around it and basically taking the radius there, and as the radius of the curve in the line gets smaller, the curvature increases.

This is one of our test areas — Panther Creek is the catchment — and that's just west of Des Moines, so very flat, not a lot of relief, and pretty much all agricultural. These are some headwater streams, and we drew profiles across these and started looking at these different elevation-derived metrics, and the one that jumped out at us is standard deviation of curvature. That is a single value you get for each cross section: you calculate the standard deviation of all the curvature values as you move across that profile, so you end up with a single value for each profile. As you move down this stream you would get a single standard deviation of curvature value for each of these profiles, and then you get to a point where the channel initiates, and what we see is a jump in mean and variance of the standard deviation of curvature. We saw this was a pretty stable signal in this area — this is one of those low-relief areas we mentioned, where drain tiles are ubiquitous and which are very hard to model — so it looked like a stable metric here, and we've been building on it.

So we have a change point, where we go from low-mean, low-variance standard deviation of curvature values to a jump. And it just so happens that changepoint detection is an old problem that people have been looking at in several different disciplines — seismology, electronics, acoustics, economics; anywhere you have signal processing, changepoint detection is a really important tool — and there are lots of different ways to do it, of course, because people have been looking at it for a long time. There can be different methods for a change in variance and others for a change in mean, and some are able to detect both. So we have some automated tools — some R packages and some Python packages that others have built — and we've also written some of our own in Python. So in the top plot, we have this stream here, we move downstream, and you get to a point where you get a change. This is cross-section count, and you can compare it to where you are moving along the signal, and then you get a jump in variance or a jump in mean.
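For illustration, here is one simple way the jump in mean and variance could be located in a sequence of per-cross-section standard-deviation-of-curvature values. This is a generic maximum-likelihood single-changepoint sketch, not the presenter's tool; packages such as ruptures in Python or changepoint in R provide more complete implementations.

```python
import numpy as np

def best_changepoint(values, min_size=5):
    """Single changepoint that best splits a 1-D signal (e.g. standard
    deviation of curvature per cross section, ordered downstream) into
    two segments, using a Gaussian cost sensitive to both mean and
    variance shifts. Returns None if no split beats no split at all."""
    values = np.asarray(values, dtype=float)

    def cost(segment):
        # negative Gaussian log-likelihood of the segment, up to constants
        return 0.5 * len(segment) * np.log(np.var(segment) + 1e-12)

    best_idx, best_cost = None, cost(values)
    for i in range(min_size, len(values) - min_size):
        c = cost(values[:i]) + cost(values[i:])
        if c < best_cost:
            best_idx, best_cost = i, c
    return best_idx
```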
I'm going to jump to North Carolina now. Our early work was in Iowa, because that's about as challenging as you can get for flow modeling, but then we were made aware of, and given access to, a great set of field-validated stream head points from North Carolina. The North Carolina Department of Environment and Natural Resources provided this data, and so we've been looking at several catchments in this area. These are field validated — single visit — using benthic species and geomorphic signs: bedform formation, bank formation, stream cutting. They're identified as intermittent, perennial, or ephemeral, and mostly what we'll be looking at is intermittent stream heads.

We have four study areas that we'll be talking about: one in the low-relief Tidewater region, the Sandy Creek study area; two in the Piedmont, which is intermediate between the mountains and that Tidewater and easier to model, just kind of gentler slopes; and then the mountainous Asheville area. This is the Rowan area, within the Piedmont region. This is the catchment, and here are the existing NHD lines for that area in blue on the left, and then the field-validated stream head points — I'm pretty sure these are intermittent; there may be one perennial and a few ephemeral stream head points here.

As I mentioned, we start by generating flow accumulation lines. A 1,000-cell skeleton is the threshold we use, so 1,000 accumulation cells is where we initiate the flow lines, and we found that reliably takes us out to the catchment boundaries, pretty much, as you can see. Then we looked at what the flow accumulation values were for all the stream heads we had, and we found that, reliably, if you take a flow accumulation of 450K as a starting point, all the stream heads will be above there. So basically, moving down channel starting from the 1,000 cells where we initiate, once you get to where 450,000 cells are contributing to that channel, we took from that point upstream and used it as the starting window for our changepoint detection, assuming that within that 1,000-to-450,000 range we would see a stream head.

The changepoint detection results for this: we ended up doing 3-meter spacing with these cross sections, 40 meters wide I believe, so very dense cross sections, but this is high-resolution QL1 lidar-derived elevation, so even at that 3 meters we can still see some variation as you move down the channel. These are the standard deviation of curvature values — you can see the legend at the bottom; they increase as the colors get cooler — and how they align with channels here.

With the changepoint methods we had good results, but automating that — being able to automate the determination of which change points were of value — was very challenging. When we look at the plots, it will find two or three change points within a channel, and we see that it's getting the stream head, but it's getting other things too, and determining which one was the important change point is a problem we have not resolved. Big issues were road intersections, so culverts, water bodies, and also there's a broad floodplain down the middle of this one, and anytime the channels got near there we got very significant change points. So there is work to be done yet on the changepoint detection here.

But we did look at the data distribution within channels from this Rowan area and found a threshold standard deviation of curvature value that did a good job of pruning back. If we identified a certain density of cross sections with the standard deviation of curvature over this threshold value — a threshold determined from the spread, the range of the values, rather than from the number of the values like a mean — we were able to get very good results.
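As a hedged sketch of how cross sections like those described above (3-meter spacing, roughly 40 meters wide) might be drawn and sampled, assuming a flow line given as vertex coordinates and a DEM array that shares the line's coordinate origin; the function name and the bilinear-sampling approach are assumptions for illustration, and the resulting profiles could be fed to the profile_curvature sketch shown earlier.

```python
import numpy as np
from scipy import ndimage

def sample_cross_sections(line_xy, dem, cell_size, spacing=3.0, width=40.0, n_pts=41):
    """Draw profiles perpendicular to a flow line at a fixed spacing and
    bilinearly sample DEM elevations along each one. Assumes line_xy is
    an (N, 2) array in the same units as cell_size, with the DEM grid
    origin at (0, 0) so row = y / cell_size and col = x / cell_size."""
    # resample the flow line at the requested along-channel spacing
    seg = np.hypot(*np.diff(line_xy, axis=0).T)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    stations = np.arange(0.0, s[-1], spacing)
    x = np.interp(stations, s, line_xy[:, 0])
    y = np.interp(stations, s, line_xy[:, 1])
    # unit vectors perpendicular to the local flow direction
    dx, dy = np.gradient(x), np.gradient(y)
    norm = np.hypot(dx, dy)
    px, py = -dy / norm, dx / norm
    offsets = np.linspace(-width / 2.0, width / 2.0, n_pts)
    profiles = []
    for xi, yi, pxi, pyi in zip(x, y, px, py):
        cols = (xi + offsets * pxi) / cell_size
        rows = (yi + offsets * pyi) / cell_size
        profiles.append(ndimage.map_coordinates(dem, [rows, cols], order=1))
    return offsets, np.array(profiles)   # offsets (m), one elevation profile per station
```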
So here, just so you can see: all channels — or cross sections, rather — that were above the threshold; and then once they reach a certain density, we cut off everything upstream, and this is what the results look like, with the stream heads superimposed as circles here. It's important that we're not using the number of cross sections, like taking averages or anything like that, because of the density variations that we see: some regions have high-density channels and others have low, so we want a threshold — a way to look at the data unique to an area — without depending on certain counts or a certain number of channels. And we've been able to test this, because the drainage density of these catchments we're looking at decreases as you move east.

This is zooming in on some of the results from the other area, the Asheville area. What we see is the stream heads are identified; these are the extracted channels in dashed lines, and then the over-extracted channels are hard to see, but they're in grey — you can see them especially down here — and the small circles are the field-validated stream heads. What we saw is that there were a lot of channels in the Asheville area that did not have identified stream heads. These are likely ephemeral channels; they just weren't mapped in the dataset, so these really hurt our scores. Also, this is the high-relief, all-forested catchment.

Then we have the Sandy Creek area in the coastal plain — as you can see, very flat — and there are a lot of drainage canals out there, so we had issues here with lines being mapped by our method but not being identified in the stream head dataset. You can see the stream head here, but there are still channels going upstream of it. So yeah, in some places our results look better, and in some places they don't. In the Gaston area here you can see that on some channels we were able to get past these water bodies and really align well with the field-validated data, and then in some other areas we just lost track of that signal.

These plots can be hard to wrap your brain around, so we'll start with the bottom one. These are the results for all the stream heads: field validated — the points provided by the state agency in North Carolina — compared to extracted, where extracted means the point found by the cross-section tool, and so this is the distance between those two points. The distance between the field-validated stream head and the extracted head is these white columns, and then the shaded columns are extracted — identified by our tool — versus apparent, where we did heads-up identification using elevation and optical data, just to have some other method of identifying the stream heads. We can see most of them fall within 50 to 200 meters of either the field-validated or heads-up-identified stream heads, and there are certainly outliers as well, but generally good results.
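A minimal sketch of the kind of distance comparison summarized in those histograms, assuming the extracted and reference stream heads are available as arrays of x, y coordinates; the function name is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def head_distances(extracted_xy, reference_xy):
    """Distance from each extracted stream head to the nearest
    field-validated (or heads-up digitized) reference head."""
    tree = cKDTree(reference_xy)
    distances, _ = tree.query(extracted_xy)
    return distances

# e.g. np.mean(head_distances(ext, ref) <= 100.0) gives the fraction of
# extracted heads falling within 100 m of a reference head
```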
We compared this to the Pelletier method, which is probably the closest thing to an existing automated stream network extraction tool. It does require an input threshold — it uses raster curvature thresholding to determine where channels could be, so you have to set a curvature threshold value, and we found you definitely needed to adjust it among these study areas — but even with the optimal threshold set, our results are very similar in accuracy to the Pelletier method, and that's documented in a manuscript that is available. Sandy Creek had probably the worst results, and a lot of the outliers were because of those canals — a lot of over-extraction is what this means: identified channels way upstream, up to more than 400 meters upstream of where the mapped field stream heads were, and those were generally the canals. This is a paper that came out last year that documents all of that work.

Now I'm going to talk about some very different work. I showed our collaborators — the group from the University of Illinois at Urbana-Champaign has been integral in this work, Wang's lab there, and Larry Stanislawski, my collaborator; they've all been a big part of this work. Larry may be on the line, actually. We're looking at using artificial neural networks to identify streams and to build hydrographic networks.

So our approach: we need training data. How are we going to get training data for these models for unique areas all across the country? We thought: look at existing vector networks — in this case streams, NHD streams rather — then do flow accumulation modeling on the area, and where those two networks match up, use those matched lines as training data. I'll show some of that coming up. We also use available road vectors from The National Map. Use those inputs, and then potentially we can train using those vectors, identify intersections, and breach roads where necessary to derive a good line set that could be used to validate NHD features.

These are the study areas. We've gotten access to some really nice contractor-derived elevation hydrography data for several areas: several areas in Alaska; North Carolina, where we have our field-validated stream head data; and that contractor data in Virginia. So these are the areas that we're looking at, plus of course our original study areas in Iowa.

How do we go at this? Those of you familiar with artificial neural networks would probably have a lot of questions; I'm just going to cover some of the basic tuning methods used here. The underlying artificial neural network algorithms — the essential models — are things built by other labs, generally for image classification or feature extraction, usually going after things like identifying cancer cells in CT images. (Sounds like someone's mixing a drink there.) We're using these models and tuning them to try to extract hydrographic features and roads. So, as I mentioned: extract dense elevation-derived drainage lines — the tools that we're looking at are all open-source tools; GeoNet and TauDEM are the ones we tested most often — match the derived lines with NHD flow lines, use those matching lines to train the models, and then use existing road vectors as well. This here is in that Panther Creek low-relief agricultural area in Iowa; the image itself is the topographic position index, the green line is the catchment boundary, and these are the matched lines for this small segment here.
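One plausible way to build the "matched lines" training mask described above is a simple mutual-buffer test on rasterized lines; this is an assumption for illustration, not the presenter's procedure.

```python
import numpy as np
from scipy import ndimage

def matched_stream_mask(nhd_mask, derived_mask, buffer_cells=3):
    """Keep only cells where rasterized NHD flow lines and elevation-
    derived drainage lines agree within a small buffer; those 'matched'
    cells become the stream training samples. Both inputs are boolean
    rasters on the same grid."""
    footprint = np.ones((3, 3), dtype=bool)
    nhd_buf = ndimage.binary_dilation(nhd_mask, footprint, iterations=buffer_cells)
    drv_buf = ndimage.binary_dilation(derived_mask, footprint, iterations=buffer_cells)
    return (derived_mask & nhd_buf) | (nhd_mask & drv_buf)
```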
Sampling strategy — how do we train the model? We randomly select pixels from the three feature categories, so that's stream, road, and then non-stream/non-road, areas that are not stream or road. So we randomly select pixels and build windows around them. Different window sizes have been tested; I think the models we're working with now use a 224-by-224-pixel window. We build those windows around the pixels and then do augmentation — distorting, skewing the window, rotating it, randomly mirroring it — and with that we end up with hundreds and often thousands of training scenes.

A bit about the software and hardware involved. Currently we're testing two different types of artificial neural networks: a pretty standard 2D convolutional neural network, where there are just convolution layers and pooling layers, and also the U-Net, which is a newer model that people have probably heard of — it's being used in a lot of domains, and it has convolution layers and then upsampling layers on the back end, plus some other modifications that I'll mention in a minute. Software: again, all open-source tools — Python, TensorFlow, Keras — all within a Linux environment. The hardware: this is the high-performance computer that we have access to in CEGIS.

Input for these neural networks: the model itself takes in layers, and we're using up to 14 layers. These are all aligned, uniform, two-dimensional raster grid layers. Some of the things that we're using are geomorphons and openness — positive and negative openness. Geomorphons, I'm sure a lot of you are aware of: it identifies pixels as belonging to one of nine or ten generally standard geomorphic features — ridge, shoulder, footslope, hollow, and so on. Openness is a mean horizon-angle measure, and it can be taken above the horizon or below the horizon, so they call that positive or negative openness. As you've seen, topographic position index is something we use, which is the difference between a pixel's elevation and the mean elevation of the surrounding pixels, and we use different window sizes. Also lidar point cloud derivatives: we use things like return intensity, return point density, and point density by class — so just looking at canopy points, or just looking at ground points. Topographic wetness index, surface curvature — these are all raster layers that we stack up and feed in, and then the training data looks at each one of those raster layers aligned with that single training window.

The F1 score is what we use to report our results, our accuracy results. There are a lot of different ways you could do it — percent of correctly identified pixels is one way you could report the accuracy of a model — but the challenge is that the percentage of stream pixels in one of these scenes is very low; there are way more non-stream, non-road pixels than there are of the other two categories. So you need a way to balance that, and the F1 score is something that's often used in this kind of pixel-wise classification test. It's a function of precision and recall, as you can see here. Take note of recall — that's the portion of reference features correctly labeled; I'm going to refer back to that — so higher false negatives reduce recall.
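For reference, the F1 score the presenter describes is just the harmonic mean of precision and recall computed from pixel counts; a minimal sketch, with a small worked example in the comments:

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 for one class (e.g. stream pixels): the harmonic mean of
    precision (TP / (TP + FP)) and recall (TP / (TP + FN)). Because the
    vast majority of pixels are neither stream nor road, this is a more
    honest score than overall percent-correct."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2.0 * precision * recall / (precision + recall)

# Example: 700 stream pixels hit, 300 falsely predicted, 400 missed
# -> precision 0.70, recall ~0.64, F1 ~0.67
```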
These are early results from some of our first tests — just 2D CNN tests in Panther Creek, so that's Iowa. These are some of our best results for testing — so not including training data; you train on one area and then test on another area — using different layer combinations. Here we can see that using TPI and openness we were able to get a 62 percent F1 score, and looking at the roads we're getting 70 percent. I'll show you what a 70 percent F1 looks like right away.

Here we have the reference data — these are road features and heads-up-identified stream channel features, so this is our best guess of what is actually on the ground. We don't use this to train; this is just what we use to score our results. And this is what the best solution for stream channels looks like for this area. You can see these are pretty good — they look better than that 62 F1 score would tell you — because there are a lot of falsely identified features out here, small features. A lot of those tend to be abandoned channels, which may be something you would add into the NHD, but we have to depend on our metrics, so the F1 score counts those as wrong. A lot of these things, we believe, can be resolved in post-processing, so these are encouraging results for roads and streams as well. Connectivity — deriving connected networks — is something that we deal with a lot here too.

Jumping to the Covington Creek area in northern Virginia: we have contractor elevation hydrography data here, and that's this on the left. Here are some test results — this is using the U-Net model now, with no attention versus attention. The attention network adds an attention layer on the upsampling side of the U-Net, which basically tells the model to focus on areas where there are positive features; it's an optimization function that's been added in, and we do often seem to get better results using it. So we get F1 scores here of 67 and 74. You can see that precision is generally higher than recall for these: once you start adding in false negatives — areas that should be identified by the model as stream or road but aren't — it tends to bring down those scores. So here we're training on this 5-by-5-kilometer area and then testing on the full watershed, and down here, training and testing on the full watershed — a lot of different ways that we're looking at this.

This is looking at the road networks for that same area, and we get F1 scores — these are both training and testing on the same area, but it gives an idea of how well we can capture that road density and pattern. Again, these are our validation dataset on the left and then the derived features on the right. If we zoom in on that small area on the right, the red box, we can see what these results look like. Here we're looking at the TPI — topographic position index — with a 21-cell window, and what we see is that we are identifying most of these features accurately; really good results. We found that we're getting water bodies pretty reliably. There are some areas where we miss: here you can see this very faint pink line is a road segment, so this would lower our score, but if you look there may not actually be a road there, so this could be inaccuracies in the vector datasets. Then if we look at the fainter green lines — you can see them best here — these are contractor flowlines, and if you look at where that one is running, it's in a very flat, probably broad channel, and probably very shallow as well, so something that's hard for the model to see. But otherwise, pretty good results.
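As a sketch only — the presenter does not give the actual architecture or hyperparameters — here is what a small 2D CNN patch classifier over stacked raster layers might look like in the TensorFlow/Keras stack mentioned in the talk, with simple flip/rotation layers standing in for the augmentation described (these augmentation layers assume a recent TensorFlow 2.x release).

```python
from tensorflow.keras import layers, models

def build_patch_classifier(window=224, n_channels=14, n_classes=3):
    """A small 2D CNN that labels a training window's class (stream,
    road, or neither) from a stack of co-registered raster layers."""
    return models.Sequential([
        layers.Input(shape=(window, window, n_channels)),
        # simple augmentation comparable to the mirroring/rotation described
        layers.RandomFlip("horizontal_and_vertical"),
        layers.RandomRotation(0.25),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_patch_classifier()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(windows, labels, ...) with windows shaped
# (n_samples, 224, 224, 14) and integer class labels
```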
Alright, I'm going to change gears again, sticking with the artificial neural networks and still looking at the U-Net model. We're still having issues identifying culverts and road-channel intersections, and it's really a hindrance to the other modeling that we're doing, because, as many of you probably know, the flow lines are really valuable for deriving these networks, and culverts are a big issue. So we're asking: can we just use these models to identify culverts? This has been facilitated again by the good folks at the North Carolina Department of Environment and Natural Resources. They have provided us with a statewide culvert dataset — this is along state-managed roads, so it's not all the culverts in the whole state; there are a lot of them on private land, or some may have been missed, that aren't identified — but it's a really good place to start.

So we've been going in here — and this is early work; we really just started tuning models about a month ago, and we're still cleaning up the data, because the models are really only as good as your training data, and I'll show some examples of that. These are two HUC-12s in this central-western North Carolina area, about 160 square kilometers, that we've pulled out to start testing this. This is an example of what we're seeing: the F1 scores we're getting so far — the best ones are in the 60s; mostly they are in the 40s and 50s — so we're not getting good F1 scores doing this yet. Part of that is we're still finding culverts that we hadn't identified in the training data. This is an example where it was trained and tested on the same region: the training features are in blue, and you can see the training feature was predicted, but it also showed us an area where we had missed one, and that lowers our F1 score.

The other big challenge here — probably the biggest challenge for gauging the quality of our results — I think is the very nonstandard shapes. There's no normal culvert shape, or culvert geomorphology. I mean, there are general things — a higher-elevation linear feature intersecting a lower-elevation linear feature — but they can look very different, and those elevation features can look different. So identifying what a culvert is, how big a culvert is, what the boundary of a culvert is — that's something that's giving us trouble as far as predicting, or gauging our results.

This is a screenshot I took today. We have the two HUCs, and we've split them up into north-south sections and east-west sections and looked at different arrangements — we'll train on the western half of each and test on the eastern half, train on the southern half, all the different arrangements — just to test different training and testing features, and that's important because most of the urban area is in the eastern half of the northern HUC.
We want to see how training or testing on that area affects the results. That's the most challenging area for all of this culvert work, as you can imagine, just because there's not always an apparent elevation feature related to the presence of drainage in urban areas. But here, these are the results for the eastern half of the northern HUC, where the urban area is — this is actually just north of the urban area — and this was trained on the western half and tested on the eastern half, and you can see we're getting good results: most of the culvert features were identified. Of course we have over-extracted, but that's not really an issue if you think about how we would use these results: ideally we would run this model and then burn the elevation model down to the lowest vicinity point within the identified culvert region, so you could see these larger, more amorphous, blobby results actually being useful for burning — they're just not good for our scores. But then also we see areas where it just totally misses. It's hard to imagine how some of these other culverts could be identified, and then this one in the middle was missed — but that's the way these things work; there's a lot we don't understand still.

We're still tuning what layers to add into this — we kind of just took that stream-network U-Net model and started feeding in culvert training data, the culvert features themselves — so we're still working on tuning window size and input feature layers. There are a lot of things that we can do to hopefully improve and get some good scores, and we're also looking at different ways to gauge that. So, things that we're looking at for model optimization: testing other deep learning models — we're going to run these tests using the 2D CNN, which we've seen we can get good results with when we ran it on streams and roads; the U-Net is certainly faster, but with the computing power available to the USGS we're not so worried about speed right now, we just want to see what gives the best results. Testing different feature shapes for the training data — different window sizes, using polygons — are other things we're trying. As I mentioned, adding in layers, testing the effect on the scores of pulling out individual layers, and adding in water bodies. Dams are kind of a middle ground — they're not exactly a culvert; it's not the intersection of a road, a linear feature like a roadway, where we would normally expect to see a culvert in the landscape — but there are other aspects of them that are very similar to culverts, so we're looking at adding in dams or not using them, and at how water bodies affect the results here. And then also the effect of the input data's spatial resolution; right now we're working at 1 meter.
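A minimal sketch of the burning step the presenter describes — lowering the DEM to the lowest point within each predicted culvert region — assuming a boolean culvert mask aligned with the DEM; the function name and the SciPy labeling approach are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def burn_culverts(dem, culvert_mask):
    """Lower the DEM within each predicted culvert region to that
    region's minimum elevation, so that flow routing can pass through
    the road embankment."""
    labels, n_regions = ndimage.label(culvert_mask)
    burned = dem.copy()
    if n_regions:
        minima = ndimage.minimum(dem, labels=labels,
                                 index=np.arange(1, n_regions + 1))
        for region_id, z_min in zip(range(1, n_regions + 1), minima):
            burned[labels == region_id] = z_min
    return burned
```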
OK, now I'm really going to change gears, and I don't have a ton of time, so I'm probably going to move quickly through this, which is going to be a challenge because it's a very different concept: sinuosity. Everybody is familiar with sinuosity and meandering streams. It's a standard characteristic of a stream, a single metric, an attribute of all streams, but it's also an indicator of the environment and the landscape. Sinuosity is a function of geomorphology, land surface type, sediment supply, surrounding sediment types, uplift, tectonic setting — all of these things can impact the sinuosity, or meandering, of streams. So it seems like an important attribute of a channel for analysis and for change detection, and it's something that we would like to be able to preserve in multiscale representation. So we're looking at ways to identify the sinuosity, or the bend geometry, of features.

Classically — if you look up the definition of sinuosity in the Glossary of Geology, for instance — it's total feature length over straight-line distance, and these are hypothetical streams that would all have the same sinuosity. It tells you nothing about the actual geometry; it's only the ratio of total feature length over the end-to-end length. So in reality, sinuosity as it's classically or usually understood tells you very little about stream geometry, and we're looking at some ways to get more detail, to better describe stream feature — or line feature in general — geometry.

One other metric that people use to look at line geometry is fractality, the fractal dimension. This is the Richardson plot — the referenced article is at the bottom here — and this is the line length as measured with certain step sizes. So the length L for this one would be the length as you measure across these steps, and your step size is on the x axis. Total length would be at the beginning here, because you would assume a very small step size, and as those step sizes get bigger — as your ruler gets longer — your measured feature length decreases, as you begin to cut off bends. This is the Richardson plot originally used to look at the borders of continents, and the slope of the Richardson plot is equal to 1 minus the fractal dimension — or the fractal dimension is equal to 1 minus this slope. That's the relation between the fractal dimension and the Richardson plot, but it gives you a single value for the whole feature.

So what we're looking at is using the scale-specific slope — the slope as you move along that plot — and we're calling that S3, the scale-specific sinuosity. Yes, that would be hard to say five times fast. So this is a plot: you have the Richardson plot and then the S3 plot of that feature here, and you can see how that slope changes as you move along it. If we take that and interpolate it, you can start to see that you can identify the feature lengths that contribute to the total fractal dimension of that feature.
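To make the definitions concrete, here is a minimal sketch of classic sinuosity and a Richardson-plot length measurement, using arc-length resampling as an approximation of the divider (ruler) method; the local slope of the resulting log-log curve, taken in a sliding window over step size, would approximate the scale-specific idea behind S3. Function names and the resampling shortcut are assumptions for illustration.

```python
import numpy as np

def sinuosity(xy):
    """Classic sinuosity: along-channel length / straight-line distance."""
    along = np.hypot(*np.diff(xy, axis=0).T).sum()
    straight = np.hypot(*(xy[-1] - xy[0]))
    return along / straight

def coarse_length(xy, step):
    """Length measured with a 'ruler' of a given step, approximated by
    resampling the line at that arc-length spacing and summing chords."""
    seg = np.hypot(*np.diff(xy, axis=0).T)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    stations = np.append(np.arange(0.0, s[-1], step), s[-1])
    pts = np.column_stack([np.interp(stations, s, xy[:, 0]),
                           np.interp(stations, s, xy[:, 1])])
    return np.hypot(*np.diff(pts, axis=0).T).sum()

def richardson_plot(xy, steps):
    """Measured length for each step size; the fractal dimension is
    1 minus the slope of the log-log fit, as described in the talk."""
    lengths = np.array([coarse_length(xy, g) for g in steps])
    slope = np.polyfit(np.log(steps), np.log(lengths), 1)[0]
    return lengths, 1.0 - slope
```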
I'm going to skip ahead to try to leave a little bit of time for questions — does that sound OK, Al, or do you guys go over sometimes? How should I handle it? Well, we can go over; we are recording this, so if people miss the last few minutes they can catch up. So yeah, if you want to try to cover a little bit more, you could. OK, thanks, sure.

As an example, we can look at a few features here. These are large streams from different ecoregions across the country, and you can see how those S3 plots vary as the bend geometry gets larger. Down here we're looking at 10,000 meters — 10 kilometers — so we're really getting into big, region-scale bends and features. These are some examples: as you move across these features, you can see that where the plot is skewed to the left we have these larger — sorry, smaller — geometries, and as that peak moves to the right, we can see the geometry of the features gets larger. We've done some tests on this using a bunch of different simulated features with different bend types, and what we see is that the peak S3 value we get — that peak location — corresponds with the peak-to-trough length of the bend geometry. So what we're actually identifying in this S3 plot is where that peak-to-trough distance is dominant — what lengths of peak-to-trough curves we are seeing in these features.

Maybe I'll leave it here so you can see how this could potentially be used. What we have is a single data vector — that line of S3 values — that can describe the variable geometry, the different bend radii or meander sizes that contribute to the total sinuosity. You could look at how that changes as you move along the feature, and it could also be used potentially for change detection for a full feature. So I will stop there and leave some time for questions.

OK, great, thanks, Ethan — lots of interesting stuff. If folks have questions you can either type them into the chat, or you can unmute yourself in the participants panel, unmute your microphone, and speak up. We do have a lot of people on, so it might be better to choose chat — we still have 170 people on. Anybody have questions? I know Ethan covered a lot of ground. There were some chats about databases of culverts. Yes — there are some state DOTs that have them, and I remember Arkansas has a state database of culverts that they've actually indexed to the NHD, so if you're interested, I could connect you with someone there, absolutely.

Yeah, so Ethan, do you have a web page or anything? One of the questions was where to learn more. So yeah — I mentioned the cross-section paper; I should have mentioned that there are some conference papers and a journal article in the works for the artificial neural network models we're working on, and these are some journal and conference papers that we have. We're working on a journal manuscript for the sinuosity work, and there is also a CEGIS website where you can get links to data, links to projects, and contact information as well. We can maybe post that in the chat. Yeah, if you could put that link in the chat, that would be good.

I did see your question about distributary networks — have you done anything looking at that? I'm not sure what that's referring to — sort of like a delta situation, where instead of a bunch of confluences you actually have the flow splitting? Right, of course. Like braided streams, maybe, or just not a single channel — is that what you mean? So yeah — some of the places we're working are up in the Kobuk region, so northwest Alaska, and that's definitely something we see up there, and we're getting good results up there — probably better than some of the other areas, because there are fewer roads there, for one thing, and it's mostly open bare ground, not a lot of vegetation. But yeah, the model is able to identify those regions as well. The Alaska work is what we're writing up, so hopefully there will be a journal article in the works.
Yeah, so there's one question about whether there are ties from this back to the generalization work that Larry had worked on quite a bit. Yeah, certainly — S3 definitely has direct implications for that, because we want to find the best way to preserve meander geometry as we change scales. As we start to generalize features, how do we preserve that valuable geomorphic characteristic that's represented in the meander geometry? Do we do that by attaching an S3 metric to the feature? Do we use it to evaluate how to get an effective representation of that feature? That's definitely an issue, generalization. There's also been some work looking at using artificial neural networks for generalization — that's something we haven't tried yet, but it's something we've seen out there, and it is probably moving in that direction.

OK, another question: what do you think about the potential to use these models in the built environment, where you have a lot of urbanization? Right — yeah, of course that's a challenge. A big part of what we're doing is trying to get good flow lines and then use those to build networks supported by the artificial neural network results, so there is not a lot of application in the built environment once we start talking about dense urban environments. But we are looking at that and seeing the effects in this culvert work — that's about as dense as we get. We make a lot of assumptions about having pretty decent flow networks for urban areas — like, we know where the streams are in cities — we tend to make those assumptions, and also those areas take up a lot less area across the country. So when you're talking about dense environments, we're not looking at those now, but down the line, something that we see as an absolute necessity is integrating urban runoff and urban sewer systems into the NHD, so it's something we're thinking about. Right, yeah, we've done a bit of experimenting with that — I think in this call series we've had a couple of presentations about that in the last year too, a Washington, DC area one.

Let's see, another question about scales of TPI and the shape of the window — I'm not quite sure what the question is there, but yeah, scale of TPI. TPI is actually a layer that we find to be very valuable for this, and other people have used it as well for thresholding where channels could be. TPI is the topographic position index, and it is the difference between a pixel's elevation value and the mean of the surrounding elevation values within a window, so the scale is the size of that window — I think I showed somewhere there was a 9-cell window or a 20-cell window. We actually use a couple of different scales; there's one of the models in Alaska where we're using a couple of different TPI scales. It seems that the most appropriate scale is controlled by the size of the feature you're looking for. We're looking for smaller features — looking for a channel, so variation in elevation as you go across the channel, or a single bank — so several meters to tens of meters is the scale that we're looking at. Especially in Alaska, where we have 5-meter elevation, we're starting to look at 20-cell windows up there.
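A minimal sketch of the TPI computation just described, assuming SciPy; note that the neighborhood mean here includes the center cell, a small simplification.

```python
import numpy as np
from scipy import ndimage

def tpi(dem, window_cells=21):
    """Topographic Position Index: each cell's elevation minus the mean
    elevation within a window_cells x window_cells neighborhood (the
    window is given in cells, e.g. 9 or 21)."""
    neighborhood_mean = ndimage.uniform_filter(dem.astype(float),
                                               size=window_cells,
                                               mode="nearest")
    return dem - neighborhood_mean

# e.g. stacking tpi(dem, 9) and tpi(dem, 21) gives two of the aligned
# raster layers that could be fed to the models described earlier
```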
OK, there was one question about the tie-in of this to the EDH pilot work in Alaska and in Texas. So yeah, certainly — that's how we see it. That's great data for us to test these models, so we've been looking at using that EDH data as training data in one area and then testing on other areas, and it's invaluable, really, because just having a good bearing on what the flow networks look like in, you know, north-central Alaska is about as good as we can get, so that's really helpful for testing these models. And we're looking forward to getting our hands on that data for Texas as well. Yeah, I probably should explain that acronym — it's kind of a new acronym that we're using. EDH stands for elevation-derived hydrography, and that's the basic concept of deriving our hydro layer, or hydro features, from high-resolution elevation data like lidar or IfSAR.

I think we've gone over by five minutes, but I think it's been really valuable, and we can probably call it good for today. Really appreciate your presentation, Ethan — lots and lots of good stuff in here, and really good questions from the audience as well. So thank you again, Ethan, and thanks everybody for attending.