#TSQL2sday #149: T-SQL Advice you’d give to your younger self

Hello! As I mentioned on my invitation post, this month I’m hosting the T-SQLTuesday and my theme was “T-SQL Advice you’d give to your younger self”. Here’s my contribution.

If I could go back in time, and maybe send myself an anonymous email (which would probably go to my spam box? hmmm), this is what I’d like to find:

  • It’s ok to use Google, like every day multiple times a day. Sometimes we face issues and we may tend to get super deep into it, but forget that we do not need to re-invent stuff. Others have likely been through what you are living and if they were great human beings they even talked about it on the internet. Isn’t that wild? We must learn to be humble. It’s OK to have your colleagues and managers see an open Google tab, as long as you are delivering the job you’ve been given, you’ll be just fine.
  • Comment your code! I’ll give you a week and you’ll forget what you were doing anyways so leave some useful comments. That will help you and others who will pick up on your work after you. I like to add the following to all my scripts: author, date, project name and a little description. Sometimes I even add the name of the person who requested the work, makes it easier to talk directly whit whom I need to.
  • Do you know those times when you find yourself repeating the same piece of code over and over? There are better ways to version your code, but as a starting point, you can create a folder on your documents and start adding sql files there that you can refer to later. Here’s what I have on my folder:
    • how to search for a column or table with a specific name;
    • how to use date functions with examples such as DATEADD, DATEDIFF;
    • how to check for temp tables column names;
    • how to format phones when they’re all messed up;
    • how to find databases, filenames and paths;
    • getting a path for sql server log error message;
    • check you database size;
    • how to check for used and used space on the database
    • and whatever else your heart desires! This is the type of work that I do and that later will save my time. I’ll probably make more posts to show my solutions to this handy scripts 🙂
  • Always ident your code and use Uppercase for all commands. You’ll notice that one day you won’t even trust people who don’t do that (oh, the drama).
  • If you have someone senior than you at work, ask them questions when you’re stuck, they probably will solve the problem faster, more elegantly and you’ll learn how to do it by watching them!
  • Make it a habit of NOT using “*” when you are doing a SELECT query. It’s ok to do it once so you can see the table, but then your real query should list the fields you want to see, by name. ✹ Marie Kondo your query and let go of what does not serve you anymore ✹
  • Don’t wait until you are an accidental DBA to learn the fundamentals. Even if you just want to be able to discuss problems with them.
  • Learn how to read the query execution plan. It will help with understanding bottlenecks and how to optimize your queries.
    • …also, when optimizing, don’t struggle about the milliseconds you might save. Nobody will notice. You want to focus on the changes that will give you the biggest differences on execution time. Think about how big the impact will be by projecting “if I don’t fix this will someone come to my table with a very angry face?”
  • Whenever you start writing an UPDATE or DELETE statement, immediately write FROM… WHERE… This will save you the headaches of forgetting to finish your command and updating or deleting everything at once.

That’s it for me. Hope this was helpful for some beginners out there. Now tell me…

What do you wish you knew earlier? 🙂

#TSQL2sday #149 Invitation: Blog about T-SQL Advice you’d give to your younger self

T-SQL Tuesday is a monthly blogothon where a host chooses one topic and bloggers around the world give their ideas about it. This month is my first time being the host and I’m so excited!

April’s theme: if you could give advice about T-SQL to your younger self, what would you say? I’m not defining any more specific sub-topics here, I’m opening the floor to hear ideas about your past experiences and what you wish you knew better back then.

Some rules:

  • You have to publish your post next Tuesday, on April 12th.
  • Include the T-SQL Tuesday logo, and if somebody clicks on the logo, it has to link back to this post.
  • Include a link to this blog post, or after you publish it, leave a comment here so I can track you down for the roundup blog post next week. If you want to share via Linkedin as well or other social media, please do and I’ll link you too.
  • Isn’t this idea awesome? Do you want to join? For future hosts, please reach out to Steve Jones via Twitter on @way0utwest. Just a heads up that there is a line of people who also want to host, so if you’re uncertain, do it anyway because you have time to prep!
  • Why host? Well, as a blogger, this is a wonderful way to ramp up your visualizations and to get another bloggers to know your work. Plus, it’s one way to give back to the community. Check out how great are the posts by looking into the previous topics.

I hope this is going to be fun and I’m definitely looking forward to reading your thoughts on my theme. Let’s get writing!

my very advanced drawing skills, as always

T-SQL Tuesday – My (least) favorite SQL data type: DATE

Hello! This post is a contribution to T-SQL Tuesday. T-SQL Tuesday is a monthly blogothon where we get together and write about a different topic. March’s topic was to blog about you favorite data type, hosted by Brent Ozar.

I wrote two versions of this post. This one you’re reading in English, and this one in Portuguese.

If you work with data, you probably do not have control over all the data sources you need. What I mean, is that for example, you may receive data from different places. Perhaps it’s your job to centralize and standardize it the best you can, so that it makes sense to your team. Once you understand the data, value can be extracted from it.

When your data comes from different systems, it’s likely you will not have control over the data’s validation. For example, one of you vendors may be very specific about the data types allowed into their systems, which means when the data gets to you, you’ll see something (ideally) more structured. However, your data may also be handled by a group of fed up developers who decided they will allow everything the user wants (special attention to the the verb there being want instead of need, big difference).

Date data types are really important and used in SQL. But for humans, dates can be formatted in some ways… For example, I’m from Brazil, and the way we write our dates is different from the US.

  • Brazil: DD/MM/YYYY

In SQL, your date type stores data like this: YYYY-MM-DD. No room for mistakes, right? Bam, Wrong.

Remember when I said you may have different data sources and you can’t control their data type validations? Let’s think of the following example:

  • Your clients use a 3rd part system
  • The 3rd party uses the data to do whatever it is their system does
  • They send you that data with the results of your project
  • You, a smart data person, tries to load the data into your system. Most importantly, your system is formatted with the data types you expect to receive. So, for example, if you expect a field with a date value, you’ll format your table to have a date type column.
  • Let’s use the table “ThirdPartyInfo” as an example.

Now, as should have assumed by now, your third party did not applied any data validation to the date types. Hence, you may get some crazy “dates”, like this:

  • Jan/2021
  • 03-20-18
  • 01-02-03 (where to even begin with this one?!)
  • 2022/2
  • and many others….

Here’s what happens when you try to insert something that’s not a date, into any of the date columns:

It does not matter the method you’ll use to input data to your table, you’ll get an error if you’re not passing date values to your date types columns.

How can you avoid issues like this

  • Be open to your vendor about why this is important to your data, and explain to them your tables data type. How? Documenting, the thing IT people hate most.
  • If the vendor is pushing back, talk to your superiors, and show them a scenario in which you need to spend your precious (expensive) time to fix this mishap. Enforce this could be avoided if everyone were on the page about the data types for the data you share.
  • If nothing above works, or you need a temporary solution, you could validate the data on your end too. More work upfront, but your future self will thank you for putting this effort now.

Learning it the hard way

  • Real life example: I recently had an issue that cost me a lot of time. I had received a csv I needed for reporting, and the file had a few date fields. My table, was expecting date types to come in all the fields, but I was getting errors on my data loading job.
    • I had to take a step back and find where the issue was. I thought it would be easy to find, by checking the most recent info that got into the system (this was a daily file we received), and so I started looking for the issue. My main mistake, was that I did not isolate the date column that was giving me an error. In the SQL Server, the error message was really vague, I could only tell there was a string trying to be converted to date and I had several different date fields in my table.
    • My second big mistake was fixing everything I found on the source file, and I thought was wrong and causing the issues. Spoiler alert: it wasn’t.
    • It took me some time to realize that “ISDATE” was an easy way to search for a column that is expecting a date type, but received something else instead:
In desperate times, we may be blind by stress and not think about simple things, like this.

You see that the return to that query has the string value. Now, you could apply this kind of check as a validation before you load data into your production table, and also use it to troubleshoot issues like mine.

With all that being said, I actually like the DATE data type in SQL. It works great, the real issue is that we as humanity never agreed on a single date format. Sigh. I hope this helps!

If you want you can read more about the Date types here: Date types in SQL

Would you solve my problem in a different way? Tell me how below in the comments (:

T-SQL Tuesday – Meu tipo de dado (menos) favorito em SQL: DATE

OlĂĄ! Este post Ă© uma contribuição ao T-SQL Tuesday. T-SQL Tuesday Ă© um blogothon mensal, onde a comunidade se reĂșne para escrever sobre um tĂłpico diferente. O tĂłpico de março  Ă© sobre seu tipo de dados favorito. Brent Ozar Ă© o host do mĂȘs.

Eu escrevi esse post em duas versĂ”es. Esta que vocĂȘ estĂĄ lendo em portuguĂȘs, e uma em inglĂȘs.

Se vocĂȘ trabalha com dados, vocĂȘ provavelmente nĂŁo tem controle sobre todas suas fontes. Por exemplo, vocĂȘ pode coletar dados de lugares diferentes. Talvez seja o seu trabalho centralizar os dados, e criar padrĂ”es para que os dados façam sentido para o seu time. Uma vez que vocĂȘ entendeu seus dados, vocĂȘ pode extrair o real valor deles.

Quando seus dados existem em diferentes sistemas, talvez vocĂȘ nĂŁo tenha controle sobre a validação que acontece por trĂĄs. 

Por exemplo, um dos seus fornecedores podem ser muito especĂ­ficos sobre os tipos de dados que eles permitem no sistema. Isso significa que quando os dados chegarem atĂ© vocĂȘ, vocĂȘ verĂĄ algo (idealmente) mais estruturado. Contudo, seus dados podem tambĂ©m prover de um sistema liderado por desenvolvedores que decidiram deixar o usuĂĄrio “livre” para fazer o que quiser (atenção ao verbo “querer” e nĂŁo “precisar”, existe uma grande diferença aĂ­).

Dados dos tipos de datas são muito importantes em SQL. Mas para nós humanos, datas podem ser formatadas de diferentes maneiras
 Por exemplo, no Brasil, escrevemos datas de um jeito diferente dos Estados Unidos.

  • Brasil: DD/MM/AAAA

Em SQL, seu tipo de data Ă© guardado no banco assim: AAAA-MM-DD. NĂŁo tem como errar, certo? Errado.

Lembra quando eu disse que vocĂȘ pode ter origens diferentes dos dados e que por isso nĂŁo tem controle sobre as validaçÔes? Vamos pensar no seguinte exemplo:

  • Seus clientes usam um sistema de outro fornecedor, passando os dados para eles
  • O fornecedor usa os dados para fazer o que quer que seja que o sistema faça
  • Eles te enviam os dados com o resultado do projeto
  • VocĂȘ, um profissional inteligente, tenta carregar os dados no seu sistema. E mais importante, seu sistema Ă© formatado com os tipos de dados que vocĂȘ espera receber. EntĂŁo, digamos que vocĂȘ queira receber um campo com valores do tipo data, vocĂȘ vai formatar sua tabela para ter um tipo de data.
  • Vamos usar a tabela “ThirdPartyInfo” abaixo como exemplo.

Agora, como vocĂȘ jĂĄ deve ter imaginado, seu fornecedor nĂŁo aplicou nenhum tipo de validação para esses tipos de dados. Logo, vocĂȘ pode se deparar com alguns tipos de datas estranhos, como esses:

  • Jan/2021
  • 03-20-18
  • 01-02-03 (onde começar com esse?!)
  • 2022/2
  • Entre outros…

Isso Ă© o que acontece quando tentamos inserir algo que nĂŁo Ă© uma data, em qualquer uma das colunas do tipo data.

NĂŁo importa o mĂ©todo que vocĂȘ use para popularizar essa tabela, vocĂȘ vai receber um erro se vocĂȘ nĂŁo estĂĄ passando valores do tipo data para suas colunas do tipo data.

Como evitar problemas como esse

  • Seja honesto com o seu fornecedor sobre a razĂŁo de ter os tipos de dados que vocĂȘ precisa, e explique quais sĂŁo eles. Como explicar? Fazendo aquilo que os profissionais de TI mais odeiam: documentando.
  • Se o fornecedor estĂĄ relutante com a mudança, fale com seus superiores, mostre para eles cenĂĄrios onde vocĂȘ tem que gastar seu tempo precioso (e caro) sĂł para arrumar esse erro. Reinforce que isso pode ser evitado se todo mundo estivesse na mesma pĂĄgina sobre os tipos dos dados que vocĂȘs compartilham.
  • Se nada acima funcionar, ou vocĂȘ precisar de uma solução temporĂĄria enquanto a situação se resolve, vocĂȘ poderia validar os dados do seu lado. Se vocĂȘ fizer um esforço agora, antes do problema, pode parecer mais trabalho, massss o vocĂȘ do futuro vai te agradecer por ter se esforçado antes do problema chegar.

Aprendendo do pior jeito

  • Exemplo da vida real: eu recentemente tive um problema que me custou um certo tempo. Eu recebi um arquivo csv necessĂĄrio para um relatĂłrio, e o arquivo tinha alguns campos de datas. Minha tabela estava esperando receber os campos de datas com valores
de datas. PorĂ©m, meu job do carregamento dos dados estava com problemas.
    • Eu tive que dar um passo para trĂĄs e tentar encontrar a raiz do problema. Eu pensei que fosse ser algo fĂĄcil de conseguir se eu olhasse as entradas mais recentes do arquivo (esse tipo de arquivo era mandado pra nĂłs todos os dias, com as coisas mais recentes atualizadas). Meu problema principal, foi que eu nĂŁo isolei a coluna de data que estava dando erro. Na verdade, no SQL Server, a mensagem de erro quase sempre Ă© muito genĂ©rica, e nĂŁo me falava qual a coluna. Tudo que eu sabia era que uma string estava tentando ser convertida em data, mas sem sucesso.
    • Meu segundo maior problema foi arrumar tudo que eu encontrei no arquivo, que pudesse de alguma forma estar causando o problema. Alerta spoiler: isso nĂŁo resolveu meu problema.
    • Eu demorei um tempo para entender que usar a função “ISDATE”, seria uma maneira fĂĄcil de procurar por uma coluna que esperava receber um tipo de data, mas que recebia outra coisa.

Em momentos de desespero, esquecemos soluçÔes simples, como esta. 

Observe que o resultado dessa consulta tem um valor do tipo texto. Agora, vocĂȘ pode aplicar essa validação ANTES de carregar os dados na sua tabela de produção, alĂ©m de usar o mesmo comando para investigar problemas como esse.

Resumindo: eu na verdade gosto do tipo DATE no SQL. Funciona bem, mas o problema real foi que falhamos como humanidade pois nunca concordamos em um Ășnico formato para datas. Espero te ajudar de alguma forma!

Se quiser ler mais sobre tipos de dados de datas em SQL, leia aqui (em inglĂȘs). 

VocĂȘ resolveria meu problema de uma maneira diferente? Me conte abaixo nos comentĂĄrios (: