Home Tags Forest

Tag: Forest

New trailer for Blade Runner 2049 is pretty dang alluring

This film looks like a genuinely novel approach to the world established by Ridley Scott.

On the heels of Zika comes its deadlier relative, yellow fever,...

Outbreaks in Brazil are raging close to urban areas that could spark another big issue.

The Amazon forest is the result of an 8,000-year experiment

Ancient peoples discovered agriculture by cultivating trees in the Amazon.

In the US, added wildfires due to carelessness, not climate change

Humans start fire when lightning wouldn't, making for a much longer risk season.

Op-ed: Mark Zuckerberg’s manifesto is a political trainwreck

He says that Facebook is developing AI to create a global democracy—kind of.

Big props to Bigfoot—new bill would name it “official cryptid” of...

"Sasquatch has made immeasurable contributions..."

Guiding ecosystem conservation using airborne lasers

It’s possible to get a sense of how forest species interact from aircraft.

Machine learning versus spam

Machine learning methods are often presented by developers of security solutions as a silver bullet, or a magic catch-all technology that will protect users from a huge range of threats.

But just how justified are these claims? Unless explanations are provided as to where and how exactly these technologies are used, these assertions appear to be little more than a marketing ploy. For many years, machine learning technology has been a working component of Kaspersky Lab’s security products, and our firm belief is that they must not be seen as a super technology capable of combating all threats. Yes, they are a highly effective protection tool, but just one tool among many. My colleague Alexey Malanov even made the point of writing an article on the Myths about machine learning in cybersecurity. At Kaspersky Lab, machine learning can be found in a number of different areas, especially when dealing with the interesting task of spam detection.

This particular task is in fact much more challenging than it appears to be at first glance.

A spam filter’s job is not only to detect and filter out all messages with undesired content but, more importantly, it has to ensure all legitimate messages are delivered to the recipient.
In other words, type I errors, or so-called false positives, need to be kept to a minimum. Another aspect that should not be forgotten is that the spam detection system needs to respond quickly.
It must work pretty much instantaneously; otherwise, it will hinder the normal exchange of email traffic. A graphic representation can be provided in a project management triangle, only in our case the three corners represent speed, absence of false positives, and the quality of spam detection; no compromise is possible on any of these three.
If we were to go to extremes, for example, spam could be filtered manually – this would provide 100% effectiveness, but minimal speed.
In another extreme case, very rigid rules could be imposed, so no email messages whatsoever would pass – the recipient would receive no spam and no legitimate messages. Yet another approach would be to filter out only known spam; in that case, some spam messages would still reach the recipient.

To find the right balance inside the triangle, we use machine learning technologies, part of which is an algorithm enabling the classifier to pass prompt and error-free verdicts for every email message. How is this algorithm built? Obviously, it requires data as input. However, before data is fed into the classifier, is must be cleansed of any ‘noise’, which is yet another problem that needs to be solved.

The greatest challenge about spam filtration is that different people may have different criteria for deciding which messages are valid, and which are spam. One user may see sales promotion messages as outright spam, while another may consider them potentially useful.

A message of this kind creates noise and thus complicates the process of building a quality machine learning algorithm. Using the language of statistics, there may be so-called outlier values in the dataset, i.e., values that are dramatically different from the rest of the data.

To address this problem, we implemented automatic outlier filtration, based on the Isolation Forest algorithm customized for this purpose. Naturally, this removes only some of the noise data, but has already made life much easier for our algorithms. After this, we obtain data that is practically ‘clean’.

The next task is to convert the data into a format that the classifier can understand, i.e., into a set of identifiers, or features.

Three of the main types of features used in our classifier are: Text features – fragments of text that often occur in spam messages.

After preprocessing, these can be used as fairly stable features. Expert features – features based on expert knowledge accumulated over many years in our databases.

They may be related to domains, the frequency of headers, etc. Raw features. Perhaps the most difficult to understand. We use parts of the message in their raw form to identify features that we have not yet factored in.

The message text is either transformed using word embedding or reduced to the Bag-of-Words model (i.e., formed into a multiset of words which does not account for grammar and word order), and then passed to the classifier, which autonomously identifies features. All these features and their combinations will help us in the final stage – the launch of the classifier. What we eventually want to see is a system that produces a minimum of false positives, works fast and achieves its principal aim – filtering out spam.

To do this, we build a complex of classifiers, and it is unique for each set of features.

For example, the best results for expert features were demonstrated by gradient boosting – the sequential building up of a composition of machine learning algorithms, in which each subsequent algorithm aims to compensate for the shortcomings of all previous algorithms. Unsurprisingly, boosting has demonstrated good results in solving a broad range of problems involving numerical and category features.

As a result, the verdicts of all classifiers are integrated, and the system produces a final verdict. Our technologies also take into account potential problems such as over-training, i.e., a situation when an algorithm works well with a training data sample, but is ineffective with a test sample.

To preclude this sort of problem from occurring, the parameters of classification algorithms are selected automatically, with the help of a Random Search algorithm. This is a general overview of how we use machine learning to combat spam.

To see how effective this method is, it is best to view the results of independent testing.

Feds say Chicago e-recycler faked tear-downs, then sent CRTs to Hong...

Enlarge / EnviroGreen's homepage. Just because a website has pictures of a lush forest doesn't mean it represents a company that does good things for the environment.EnviroGreen reader comments 23 Share this story According to an indictment filed in Chicago federal court (PDF) late last week, 45-year-old Brian Brundage cut some serious corners while running his e-recycling businesses. He was arrested on Monday on charges of income tax evasion, mail fraud, and wire fraud. Brundage is the former owner of Chicago-based Intercon Solutions and the current owner of EnviroGreen Processing, based in Gary, Indiana.

Both recycling companies purported to sell e-recycling services to companies and government organizations that needed to get rid of old electronics.

Brundage promised his clients that their old computers, TV monitors, and various other devices would be broken down into their component parts and recycled in keeping with federal guidelines. Instead, feds allege that Brundage shipped some of those electronics for illegal disposal in landfills overseas.

Those electronics included Cathode Ray Tubes (CRTs) from old computer and TV monitors, which contained “hazardous amounts of lead,” as well as batteries.

The electronics that weren't shipped to Asia were destroyed inappropriately on the premises of his businesses or stockpiled indefinitely in warehouses, which is forbidden by federal guidelines. According to the indictment, Brundage also improperly resold many of the electronics he acquired.

Between 2009 and 2015, Brundage received shipments of calculators from an unnamed technology company in Texas with instructions to disassemble the calculators and recycle them accordingly.

But Brundage apparently resold the calculators to another company based in Tampa, Florida, which purchased and sold used electronics. (The Chicago Tribune notes that one of Brundage’s clients was Texas Instruments, but the company didn't respond to Ars' request for comment on the matter.) In exchange for the shipments of calculators, Brundage allegedly had the company in Tampa directly pay some of Brundage’s personal expenses.

Those expense include between $31,000 and $39,000 per year for a nanny and $26,000 to $42,000 per year for a housekeeper, as well as tens of thousands of dollars for jewelry expenses and payments to an Indiana-based casino. Among the more colorful accusations in the US government’s indictment of Brundage: the businessman allegedly went to lengths to fool third-party auditors into giving his companies the certifications necessary to keep doing business as an e-recycler.

Brundage allegedly invited unknowing customers on sham tours of Intercon’s facility. Once there, he "directed Intercon's warehouse staff to set up a staged disassembly line to make it falsely appear as though Intercon regularly processed e-waste in a manner that was consistent with its public representations." The Chicago Tribune published a feature on Intercon in 2007.
In it, Brundage is quoted saying, “We put old products on a disassembly line. We break each item down to raw materials and send them off to be smelted and reused.” He added, “nothing that leaves here goes to a landfill.” The indictment against Brundage only reaches back as far as 2009, so it’s unclear whether Brundage’s statements in 2007 were actually the case or whether the Tribune had been duped.

Brundage has operated as an e-recycler since 2000 when he purchased Intercon Solutions from its previous owner. In fact, Brundage has faced accusations of improperly disposing of e-waste materials since 2011, when he applied for an e-Stewards Certification, a certification that says the recycler is held to high standards, through the Seattle-based Basel Action Network (BAN).

BAN is an environmental organization that fights toxic and electronic dumping.
Instead of simply certifying Intercon Solutions, BAN alleged that it found evidence that Intercon was shipping CRT monitors and batteries to Hong Kong.

Brundage denied the allegations.
In response, he sued BAN for defamation.

The case wound its way through the court and was dismissed by a Chicago federal judge in October 2015 (PDF). Last week’s indictment also accuses Brundage of shipping “large quantities of e-waste” to Hong Kong, adding that in May 2011, the Hong Kong Environmental Protection Department discovered a shipping container full of waste and sent the container back to the US.

The indictment alleges that after the May 2011 incident, Brundage destroyed business records pertaining to previous shipping agreements but continued to ship e-waste overseas, with fraudulent labels and shipping reports. Brundage allegedly took destruction of e-waste into his own hands, too. He allegedly smashed CRT glass “in outdoor areas, without taking measures to prevent the release of potentially hazardous material into the environment.” The US government says that Brundage earned “millions of dollars” from his illegal schemes.

The government is asking for a judgment requiring that he forfeit all property obtained “directly and indirectly” from the alleged dealings.

Would-be Internet weatherman star sets a wildfire to increase his viewership

Enlarge / Screengrab from Johnny Mullins' latest "weather update."Facebook reader comments 51 Share this story Johnny Mullins' Facebook page contains a mix of "Crooked Hillary" memes and "weather outlook" video updates.

For all of his talents, Mullins, alas, has attracted only 82 followers on the social network, and the wannabe Internet weatherman clearly felt like he deserved a wider audience.
So Mullins, who lives in Jenkins, Kentucky, took matters into his own hands earlier this month. According to Jenkins Police Chief James Stephens, the 21-year-old Mullins now faces a charge of second-degree arson for setting a fire in Letcher County. The chief told the Associated Press on Friday that Mullins said he started the fire he was charged with "because he enjoyed the attention he got from the Facebook stuff." During the last video Mullins posted on Facebook, on November 6, the suspected arsonist says, "Some areas are dealing with forest fires here in Eastern Kentucky also, but not here in the area where I am. Of course we hope not, but as you go through the day today you just want to be extremely careful if you're out there." Mullins offers, helpfully, that people should avoid smoke if possible. The "of course we hope not" is a common trope in meteorology that forecasters sometimes trot out when their area is threatened by bad weather.

Although credible meteorologists guard against "wishcasting," the reality is that bad weather is good for forecasters, because it magnifies attention on their work. Hurricanes, winter storms, and adverse events dramatically increase TV ratings and website clicks. Opportunistic arsonists have only increased the burden of widespread wildfires across the southeastern United States, which have been most concentrated this month in the southern Appalachia area.

Extremely dry conditions have led to the deployment of more than 5,000 firefighters to the region.

Tens of thousands of acres have burned.

How security flaws work: SQL injection

reader comments 69 Share this story A demonstration of SQL injection in action. Thirty-one-year-old Laurie Love is currently staring down the possibility of 99 years in prison.

After being extradited to the US recently, he stands accused of attacking systems belonging to the US government.

The attack was allegedly part of the #OpLastResort hack in 2013, which targeted the US Army, the US Federal Reserve, the FBI, NASA, and the Missile Defense Agency in retaliation over the tragic suicide of Aaron Swartz as the hacktivist infamously awaited trial. Love is accused of participating in the #OpLastResort initiative through SQL injection attacks, an increasingly common tactic.
SQL injections have recently been detected against state electoral boards, and these attacks are regularly implicated in thefts of financial info.

Today, they've become a significant and recurring problem. SQL injection attacks exist at the opposite end of the complexity spectrum from buffer overflows, the subject of our last in-depth security analysis. Rather than manipulating the low-level details of how processors call functions, SQL injection attacks are generally used against high-level languages like PHP and Java, along with the database libraries that applications in these languages use. Where buffer overflows require all sorts of knowledge about processors and assemblers, SQL injection requires nothing more than fiddling with a URL. As with buffer overflows, SQL injection flaws have a long history and continue to be widely used in real-world attacks.

But unlike buffer overflows, there's really no excuse for the continued prevalence of SQL injection attacks: the tools to robustly protect against them are widely known.

The problem is, many developers just don't bother to use them. One of Microsoft's less valuable innovations The earliest description of these attacks probably came in 1998, when security researcher Jeff Forristal, writing under the name "rain.forest.puppy," wrote about various features of Microsoft's IIS 3 and 4 Web servers in the hacker publication Phrack. IIS came with several extensions that provided ways to generate webpages based on data from databases.

Then and now, most databases use variants of a language called SQL (Structured Query Language) to manipulate their data.

Databases using SQL organize data into tables built up of rows and columns.

Each table represents a particular kind of thing.

Each column of the table represents a particular fact about that thing, and each row of the table is an instance of that thing.
So, for example, a table named "people" might have columns for "age" and "name," with each row in the table representing a specific person.
SQL is used for both defining these tables and columns and for manipulating the rows within them. IIS had several different ways of writing SQL commands—"queries"—to find and retrieve information in an SQL database.

The best-known and longest-lived of these is ASP (Active Server Pages), a system for writing webpages with embedded programming (usually using JavaScript or VBScript) that typically include substantial amounts of database access.

At the time, IIS also included something called IDC (Internet Database Connector) that was a less flexible way of sending an SQL query to a database and tabulating the results. Sometimes, the SQL query that these IDC and ASP files used for grabbing information from the database was hardcoded; that is, the same query was used every single time the webpage was loaded.

But because the query was often written to take one or more parameters, the data shown on that page could change as those parameters changed. For example, an online store might have a page named order.asp to display the contents and status of an order.

Each order is identified by an order ID, with the order ID passed as a parameter to the page as part of the URL: order.asp?orderID=1234, say, to view the order with ID 1234.

The ASP page would take the parameter from the URL and combine it into an SQL query, which was then sent to the database for lookup. What Forristal noticed was that the way these parameters were combined to build the query meant that an attacker could force the database to execute other queries of the attacker's choosing.

This act of subverting the application to run queries chosen by an attacker is called SQL injection. A popular tool to this day Since their initial discovery, SQL injection flaws have routinely been discovered in the wild and used to compromise vast quantities of data. While Forristal looked at Microsoft's software first, SQL injection was an industry-wide problem; sites using Java, PHP, ColdFusion, Ruby, and Python have all had SQL injection flaws.
Virtually every technology that can be used to build dynamic, database-driven websites is susceptible to SQL injection. And attacks using SQL injection are abundant.

Earlier this year, a Florida man was charged with felony hacking after using SQL injection to read sensitive data from an election site.
SQL injection vulnerabilities in the Joomla CMS and popular WordPress plugins have put hundreds of thousands of blogs at risk of attack. Security firm HBGary was devastatingly attacked in 2011 after members of the Anonymous collective discovered SQL injection flaws in a custom-developed content management system.

The group responsible for that attack would later go on to call itself Lulzsec. What makes these attacks particularly valuable to attackers—and devastating to victims—is their dual power to both read sensitive data from a database and to write new or updated data to the database. With this power to inject SQL, attackers could potentially read usernames, passwords, credit card numbers, Social Security numbers, or whatever else happened to be stored in the database. But they could also add their own data.
In a well-designed system, passwords, for example, won't be stored as plain text but instead will be irreversibly transformed into a hashed value. Merely reading the database therefore won't be sufficient to let an attacker log in and snoop around; while the database will contain the necessary username, it won't reveal the password.

The ability to write to the database, however, means that an attacker can simply create additional user accounts—ones for which they do know the passwords—and log in with those. Similarly, if the database is used to store any kind of auditing information, such as tracking logins and other user activity, the ability to write to it means that hackers can clean up after themselves, deleting any record of their attack. To understand how these attacks happen, we need to understand how Web applications use databases.

Basic SQL queries have a fairly simple structure.

For example, a call to search for a particular order might look like SELECT * FROM orders WHERE orderid=1234 The * means "retrieve all the columns," orders is the name of the table of data, and WHERE orderid=1234 restricts the data to only include the rows where the value of the orderid column is 1234.

The order ID number being searched for would typically be taken from parameter embedded in the URL.
So the 1234 from order.asp?orderID=1234 is combined with SELECT * FROM orders WHERE orderid= to produce the SQL query SELECT * FROM orders WHERE orderid=1234, and that query will be sent to the database. The fragment of code that does this might look something like: // first, pull the order ID from the URL query var orderid = Request.QueryString("orderid"); // then use the order ID to construct a query var query = "SELECT * FROM orders WHERE orderid = " + orderid; // and send the whole string to the database to execute But two things together make this problematic.

First, SQL databases allow multiple queries to be strung together one after the other.

To perform two searches, one could write: SELECT * FROM orders WHERE orderid=1234 SELECT * FROM orders WHERE orderid=5678 Different options and databases might change this a little—for example, by requiring a semicolon between the two queries—but the basic principle is the same. Second, URLs are not constrained to being numbers; they're just text strings.
So while the order ID should be a single number like 1234, it doesn't have to be.

An attacker could request, say, order.asp?orderID=1234 SELECT * FROM orders WHERE orderid=5678, putting an SQL fragment into the URL itself.
If the application does not take care to protect itself against SQL injection attacks, the query it constructs will include the attacker's code, and the database will run both queries together. The way the query string is built will often change the exact code the attacker has to write.
Integer parameters, as with the order ID, are typically the easiest to handle because they're typically not modified or altered in any way.
String parameters can introduce some additional complexity, because strings have to be wrapped in quote marks within the SQL query.

The code to do this might look something like: // first, pull the name from the URL query var customername = Request.QueryString("name"); // then use the name to construct a query var query = "SELECT * FROM customers WHERE customername = '" + customername + "'"; // and send the whole string to the database to execute In this situation, simply appending an SQL command to the URL won't work because of those single quote marks. Making a request to, say, customer.asp?name=SELECT * FROM orders would produce the following query: SELECT * FROM customers WHERE customername = 'SELECT * FROM orders' This won't reveal any data; it will just look for a customer with the rather implausible name of "SELECT * FROM orders".

But the attacker can solve this, prepending the attack query with a single quote mark, producing: SELECT * FROM customers WHERE customername = '' SELECT * FROM orders' That stray quote mark at the end of the query might cause the database to reject the query, so the usual solution is to comment it out, so that it is ignored, using -⁠-.

Accordingly, the URL would look like customer.asp?name=' SELECT * FROM orders -⁠⁠-, and the query would be: SELECT * FROM customers WHERE customername = '' SELECT * FROM orders --' Query in, data out Attackers can do a lot with the ability to simply run queries.

Destructive actions in particular, such as deleting data, don't really need much in the way of output.
Some databases let you run command-line programs from within queries, which again can be all hackers need to do their dirty work.

But generally, an attacker needs to read information as well as write it, both to retrieve valuable data stored in the database and to learn about the names of the database's tables and columns. Running a query of the attacker's choosing is one thing; seeing the results is another. When a query is run, the usual response from the database is a table of data.

Applications vary in the way they react to this.
Some applications will simply present the user with every bit of data returned in the table, making it very easy for an attacker to extract information from the system. Others will expect the result table to have particular columns and ignore any data not included in those columns.

This isn't an insurmountable problem for attackers, as they may be able to ensure that the data they want appears in one of those columns by altering the form of the query they write. Similarly, an application may iterate through every row of data returned by the query and show them all to the attacker.

Again, this represents the easiest situation for data exfiltration.

Alternatively, if the application expects only a single row to be returned, it may only show the first row that the injected query returns.

The application might even issue an error message if it's expecting only a single row to be returned by the database.

These situations make extracting data more complex, though again, it can be addressed by structuring the injected query just so. The other main response to a query is an error message, indicating that something went wrong.

Typically when injecting SQL, this will be an error from the database to complain that the query was malformed in some way.

Again, applications vary in how helpful they make this error message.

The best case for the attacker is when the developer is lazy and simply spits back the full error message to all and sundry.

This is often done during development, as it makes the develop/test/debug cycle quicker, but it shouldn't ever happen in real production systems. To take advantage of this, the attacker can ensure that every injected query results in an error that reveals some information.

For example, trying to convert a string into an integer will usually result in an error, and that error message will include the string that couldn't be converted.
So, to extract the database name, an attacker could run a query looking like this (in SQL Server): SELECT * FROM orders WHERE orderid=CONVERT(int, db_name()) db_name() here is an SQL Server function that returns the name of the current database.

CONVERT tries to convert that name into an integer.

This will fail, and it will produce an error message looking like: Conversion failed when converting the nvarchar value 'databasenamehere' to data type int. Thereby revealing the database's name to an attacker. Going in blind Better applications will show some generic error page, robbing the attacker of this useful information.

Even when the application is strict in what it displays—suppressing error messages, showing only columns that it's supposed to, and not showing multiple rows when only one is expected—attackers who can inject SQL might be able to piece together the information they need. One common technique for doing this is called "blind" SQL injection. In blind injection attacks, the attacker cannot write a query that simply displays the desired information.
Instead, the attacker constructs queries that extract one bit of data (literally, in the common case) at a time.

Consider again the query to show an order: SELECT * FROM orders WHERE orderid=1234 Presuming that order exists, this request will return exactly one row. We can add additional terms to the query and it will still return that one row: SELECT * FROM orders WHERE orderid=1234 AND 1=1 Conversely, we could make it return exactly zero rows to get some kind of an "order not found" page: SELECT * FROM orders WHERE orderid=1234 AND 1=0 Armed with this knowledge—we have a website that is returning one particular page for "true" and one particular page for "false"—we can start probing the database for information.

For example, perhaps we know that the victim is running Microsoft SQL Server, and we have an exploit of some kind that works against, say, SQL Server 2012. We therefore want to know if the system is running SQL Server 2012.

To do that in SQL Server, we'd run a query like: SELECT SERVERPROPERTY('ProductMajorVersion') = 11 This will return true if running on SQL Server 2012 but false otherwise.

This can be injected to produce a query such as: SELECT * FROM orders WHERE orderid=1234 AND SERVERPROPERTY('ProductMajorVersion') = 11 If running on SQL Server 2012, this will produce the normal order page. Otherwise, it will produce the error "no order found" page, thereby giving the attacker a single piece of information. More advanced techniques built on the same basic principle can allow several bits of information to be extracted at once. It will not be a great surprise to learn that extracting information bit by bit is slow and tedious.

Accordingly, a variety of tools exist to automate the process.

They can systematically send a barrage of requests to a site vulnerable to SQL injection in order to extract data from it.

Because this process is slow, blind injection won't be used to perform mass extraction.
Instead, it might be used to retrieve key pieces of information, such as administrative login credentials, and those credentials will be used to take further system action. String sanitization SQL injection is widely known and, because so much data in databases is sensitive, can be extraordinarily high-impact. Why do developers keep writing such bugs? Much discussion of SQL injection describes it as being substantially a problem of input validation.

The order ID in our example above, for example, should be an integer, not an arbitrary piece of text, and so we could fix the code to make it safe by forcing the order ID to be a number, ignoring any other characters.

For numeric data this is easy enough to enforce, as converting strings into integers is easy and robust. But for strings, which legitimately could contain any character under the sun, the process is a little harder. Or perhaps more accurately, it's a process that developers continue to screw up. The basic idea is one familiar to many developers: the special characters (such as quote marks) need to be transformed so that the query treats them as part of a string, rather than the end of a string.
In SQL Server, quote marks are escaped by doubling them.

A single quote mark, ', is used to denote the start or end of a string; two quote marks together, '', are treated as a literal quote mark when included as part of a string. MySQL uses backslashes for a similar purpose; to embed a single quote into a string, it must be written as \' (and to embed a backslash, use \\). This should, then, be straightforward for developers to handle: make sure any quote marks in the input are escaped, and the database will treat them correctly.

This can be done, and it does work if it's done consistently, with every single string properly cleaned up.

But if even one is forgotten, the security of the entire system can be undermined. Often, the "solutions" to this problem are inadequate at best. Many devs will have come across Web forms that prohibit characters such as the single quote from being used in usernames or passwords.

They might even go further and prohibit SQL-like words, such as SELECT, from being used.

These things are telltale signs that a developer is vaguely aware of SQL injection (and perhaps certain other problems, such as cross-site scripting) but has either opted not to make a robust fix or lacks confidence in the fix they have applied. Better yet: stop treating everything as a string The better solution requires changing how we think of SQL injection. The problem is the way that these applications add the parameters to their SQL queries.
In most programming languages, there's a hard distinction between the various special keywords and syntax of the programming languages, on one hand, and the data values that the program manipulates, on the other.

For example, many languages have a keyword goto to jump from one place in the code to another.

They also support text strings, and a text string could contain the text "goto".

These two things are separated; the program will never see the "goto" text string and try to jump to a different location.

Although the string happens to look like a language keyword, it is opaque; most languages don't try to treat data as if it were part of the program, so there is no risk that a string that reads "goto" will accidentally be interpreted as a goto keyword. (A few languages, such as shell scripting and tcl, blur this distinction in some ways, but few people use these on the Web.) SQL, too, maintains this distinction and does not confuse data as if it were commands. Most programs that work with databases, however, are not written in SQL directly.
Instead, they build their SQL queries as strings of text and send those strings to the database to process. By manipulating the queries as pieces of text, however, they create confusion between program commands and data.
String manipulation is done without concern for the syntax and grammar of SQL.
String manipulation doesn't distinguish between the query and the query's parameters, and so it doesn't correctly handle those troublesome quote marks.

A mere string doesn't know that some quote marks should be escaped, because they're meant to be parameters, and that others should not, because they're mean to be part of a query. Developers use strings to build queries because building strings is a basic part of almost all programming languages, and so it's a very simple, familiar way for developers to accomplish tasks. Most development languages don't have any integrated language support for SQL (though C# does have something along those lines), so using strings and string manipulation is the simplest way to get started with writing database-driven applications. Most beginners' tutorials will start with string manipulation, so the habit is inculcated pretty quickly, too.

At some point budding developers will get bitten by an SQL injection bug, so they'll try escaping strings, and that will be as good as it gets. But mainstream database APIs offer a better way of working with SQL—better both because it isn't prone to SQL injection attacks, and better because it can have slightly superior performance, too.

And unlike escaping of inputs, this alternative approach can't really be screwed up.
It doesn't rely on checking and validating every single parameter individually (leaving an application vulnerable if even one parameter should be neglected). That better way is through the use of prepared statements. Prepared statements provide separation between the query and the parameters the query uses.

The exact details vary slightly from language to language and API and API, but the basic idea is the same: instead of splicing together a bunch of strings to create the query, a prepared statement uses a query with placeholders used for each parameter.

The values that these parameters should take aren't populated by manipulating strings; they're assigned using the prepared statement API. For example, using C# and the ADO.NET API, the code would look something like this: // create the prepared statement object from a database connection SqlCommand command = connection.CreateCommand(); // define the query that the prepared statement uses, with @name being a placeholder for a parameter command.CommandText = "SELECT * FROM customers WHERE customername = @name"; // define the parameter as being a string type of any length command.Parameters.Add("@name", SqlDbType.VarChar, -1); // and set the parameter's value to the query string from the URL command.Parameters["@name"].Value = Request.QueryString("customername"); The API knows that the @name parameter should be a string (varchar is the SQL name for variable-length character sequences), so when the @name parameter has its value specified (in the last line), the API will always do the right thing.
It doesn't matter if attackers try to pass in quote marks, escape characters, or anything else as they try to trick the system into executing their SQL query; the system will always treat the parameter as string data. Using prepared statements like this rules out any chance of forgetting to escape data or escaping it improperly, because it entirely removes the need to escape data in the first place.

The query can still fail—an attempt to put string data into an integer parameter, for example, will still cause an error—but it will do so in a safe way that does not risk executing attacker-specified SQL queries. A problem that shouldn't still be with us, but is When looking at buffer overflows, one of the major themes was a never-ending series of escalations between defenders and attackers. New techniques, such as non-executable memory and address space layout randomization, have been introduced in operating systems to try to make buffer overflows easier to detect by developers and harder to exploit by attackers, but the core problem is caused primarily by the behavior of the C and C++ languages.
It will not go away until those languages are either abandoned or radically altered.

There's no perfect solution that's compatible with C and C++, so we have instead this range of mitigations and defenses. The same is not true of SQL injection. Prepared statements are a robust and virtually universal solution to the problem of SQL injection.

Every API worth using supports them, and yet SQL injection flaws remain in abundance.

Commercial software, open source software, custom-developed software—they're all afflicted. But with developers apparently unwilling to do things the right way, it's time for the creators of these frameworks to do the right thing and force all database access to go through prepared statements.

This won't make it impossible for wayward developers to use string manipulation and continue to create highly exploitable flaws, but it will make it much less attractive as the default approach to writing data-driven applications. Likewise, tutorials and training need to start doing things the right way from the outset.
Instead of creating bad habits and then trying to fix them, developers should be taught to do them properly from the outset. All this makes SQL injection a particularly frustrating flaw.

This is something that should be an historic artifact, a relic from a time before people understood the importance of writing secure applications, before widespread awareness of hackers and the techniques they use.

That it isn't shows that there's something not quite right about industry software development practices: we know how to do better, but we're not. Our credit cards, passwords, and other sensitive data are going to continue to be placed at needless risk as a result.

Building A Detection Strategy With The Right Metrics

The tools used in detecting intrusions can lead to an overwhelming number of alerts, but they're a vital part of security. Traditional security used to be focused exclusively on prevention and policy enforcement. While these mechanisms are still important, there is increasing attention paid to detection in order to be able to more rapidly spot potentially malicious activities that have circumvented preventative security. The leading security teams are shifting their mission to detect a potential attack and respond appropriately and quickly.

To achieve such a mission, organizations typically implement the following detection (or incident response) process: Start with an alert. Investigate the alert and determine if it is benign or unwanted. Make a proper decision and remediate or respond. This detection process can be implemented at different levels of detail in various security organizations.
I have seen implementations conducted by a single-person security team all the way to large security operations centers. Of course, an effective detection strategy isn’t easy to achieve with a limited security budget and constrained staff.

A major factor determining the effectiveness of the detection process is that security teams have to deal with the massive inefficiency and imprecision of security tools.

For most organizations, the total alert volume is far beyond what any group can handle. Most alerts aren’t useful because either they don’t highlight anything of importance or are nonspecific or unsubstantiated, making an already impossible situation even worse.

This situation is created by three main factors. First, most tools are still primarily malware-oriented and based on technical artifacts, such as a signature, hash, domain, or a predefined list of software behaviors.

This means that alerts will be firing all the time. Second, many tools are designed to detect isolated “trees” and don’t understand that they are part of an overall “forest” of an orchestrated campaign being run by an attacker.
It’s common to see a singular event — let’s say a scan inside the network — set off a hundreds or even thousands of individual alerts. Third, many traditional tools are based on a do-it-yourself mentality.

They’re created assuming a completely manual style of operational security. Part of this recognizes that most tools are rather low in fidelity and not intended to offload the work of a security professional by conducting some level of investigation or making a judgment call. If organizations are going to be successful with detection, they need to measure and improve their efficiency.

The following questions can help develop an effective detection strategy, or evaluate a specific new or existing detection tool: Detection coverage: If an active attacker were operating inside my network, would my systems see an operational activity and set off an alert? Detection quantity: Can my team investigate all the relevant alerts? Is it clear which are relevant? Detection quality: When an alert is investigated, can I reach a conclusion? Detection coverage is determined by what is being sought — is a tool looking for malware? Reconnaissance? Lateral movement? The best approach here is to define a few use cases that you’d like to detect and make sure that they’re covered. Most data breaches are ultimately the result of an attacker that operates inside the internal network.

The use cases can either be defined theoretically or tested live as an attack simulation or penetration test. Detection quality and quantity can be measured in a very concrete way using standardized metrics: Detection quantity: efficiency — Count the total number of alerts per day and calculate how many alerts you have per 1,000 hosts. Detection quality: accuracy — Percentage of alerts handled by the team that are useful (not whitelisted or ignored) over the total number of alerts. A Ponemon Institute study of 630 enterprises showed that the average number of alerts a security team receives per week from their security tools — firewalls, intrusion-detection systems, intrusion-prevention systems, sandboxing systems, security information and event management (SIEM) systems, etc. — was a shocking 172 per 1,000 hosts per day (2,420 alerts for an average 14,000 endpoints in a network). Where does one start with that volume of alerts, particularly given that the large majority of these are false-positives? The same study shows that, on average, two-thirds of the security staff’s time is wasted because their tools are grossly inefficient, so only 4% of alerts are typically investigated, leaving the possibility that somewhere around 96% of those untouched may be something important. This is clearly an inefficient, inaccurate process because there is no good way to guarantee a focus on the right alerts.

There are examples where signs of an attacker were hidden in a flood of alerts.
In the Target breach, according to reported details, signals were detected by existing systems but buried under thousands of similar alerts without any way to know that this particular alert was important. It’s not enough to have a haystack with some needles that can only be discovered with hindsight after it’s too late; we need a process that can get to the important needles to proactively detect attacks, before damage is done. Realistically, security teams can actively handle a handful of alerts per 1,000 hosts per day.

This is based on the Ponemon data (4% of handled events equals 6.8 alerts/1,000 hosts/day) and also on my experience with different chief information security officers.
If a process generates many more, they will be ignored. Using metrics enables evaluation and comparison of different tools and methods, including overlapping products.

For example, you can compare SIEM-based event correlation to network traffic analytics, sandbox-based detection, user and entity behavior analytics, endpoint detection, malware analysis, and so on, with these questions: Does the type of new alerts add significant coverage or attack detection capability? How many alerts per day does the system normally generate? Can my team investigate that amount? What percentage of the alerts can be handled and are relevant to handle? With this perspective, we can ask questions such as, “Will this increase detection accuracy?” “How is this security tool making the team more efficient?” and “Can this security solution reduce staffing requirements?” Related Content: Giora Engel, vice president, product & strategy at LightCyber is a serial entrepreneur with many years of technological and managerial experience.

For nearly a decade, he served as an officer in an elite technological unit in the Israel Defense Forces, where he initiated and ...
View Full Bio More Insights