Optymalizacja zapytania

0

Hej, mam problem z optymalizacją zapytania, struktura wygląda mniej więcej tak:
jest tabela z firmami oraz z produktami połączone wiele do wielu w tabeli łączącej dodatkowo przechowywana jest ilość produktu oraz data wygaśnięcia licencji produktu - mam zapytanie które pobiera firmy wraz z datą zakończenie licencji najstarszego produktu wygląda to tak:


SELECT 
company.id, 
company.name, 
(
    SELECT product_brand.name FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_name,
(
    SELECT product_brand.id FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_id,
(SELECT MIN(`date`) FROM company_products WHERE company = company.id ) as date_end, 
(SELECT SUM(quantity) FROM company_products WHERE company = company.id) as quantity, 
company.phone, 
company.zip, 
company.city, 
state.id as state_id, 
state.name as state_name, 
company.street,
user.id as user_id, 
user.name as user_name, 
user.surname as user_surname, 
company_group.id as company_group_id, 
company_group.name as company_group_name
FROM `company` 
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN user ON company.supervisor = user.id 
LEFT JOIN state ON company.state = state.id

Zapytanie działa bardzo szybko koło 0.002s problem pojawia się przy orderach np order by (SELECT MIN(`date`) FROM company_products WHERE company = company.id ) trwa już 0.4s jak przyspieszyć takie zapytanie ? Czy dobrym pomysłem było by trzymanie date i ilości w tabeli firm i update tych wartości wraz ze zmianami w tabli z produktami?

1

Co, chwila - wrzuć może schemat tej bazy oraz jakieś przykładowe dane, bo ciężko stwierdzić, co Twoje zapytanie w ogóle robi.

3

order by select??? - co to za potwór?

Ile dostajesz rekordów w wyniku? Zamiast order by select zrób order by x, gdzie x to numer, zaczynając od 1, kolumny po której chcesz posortować

0

Fragment o którym mowa:
schema.png
Chodzi o to żeby pobrać firmy wraz z datą najstarszego produktu oraz nazwą firmy tego produktu.
@abrakadaber Przy order by 'numer kolumny' jest w ogóle tragicznie źle - koło 3.2s
Ilości danych: company - koło 100k, company_products 30k, product 400, i product_brand 114,

1
Czarny Karp napisał(a):

SELECT 
company.id, 
company.name, 
(
    SELECT product_brand.name FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_name,
(
    SELECT product_brand.id FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_id,
(SELECT MIN(`date`) FROM company_products WHERE company = company.id ) as date_end, 
(SELECT SUM(quantity) FROM company_products WHERE company = company.id) as quantity, 
company.phone, 
company.zip, 
company.city, 
state.id as state_id, 
state.name as state_name, 
company.street,
user.id as user_id, 
user.name as user_name, 
user.surname as user_surname, 
company_group.id as company_group_id, 
company_group.name as company_group_name
FROM `company` 
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN user ON company.supervisor = user.id 
LEFT JOIN state ON company.state = state.id

Zamień na:


SELECT 
company.id, 
company.name, 
(
    SELECT product_brand.name FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_name,
(
    SELECT product_brand.id FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC limit 1
) as brand_id,
x.date_end,
x.quantity, 
company.phone, 
company.zip, 
company.city, 
state.id as state_id, 
state.name as state_name, 
company.street,
user.id as user_id, 
user.name as user_name, 
user.surname as user_surname, 
company_group.id as company_group_id, 
company_group.name as company_group_name
FROM `company`
LEFT JOIN (SELECT company, MIN(`date`) date_end, SUM(quantity) quantity FROM company_products GROUP BY company ) x ON x.company = company.id    
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN user ON company.supervisor = user.id 
LEFT JOIN state ON company.state = state.id


I sprawdź wydajność. Możesz dorzucić również indeks na tabeli company_products na (company, date) i znowu sprawddzić.

0

Niestety ta wersja jest dużo wolniejsza, oryginalne zapytanie bez orderów trwa 0.0020 po zmianach 0.0573 po dodaniu order czas to 2.3s dla nowego zapytania

0

Możesz spróbować czegoś takiego:


CREATE TEMPORARY TABLE tmp_table AS (
    SELECT company.id as company_id
    , product_brand.id as brand_id
    , product_brand.name as brand_name
    , `date`
    FROM company_products
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
);

CREATE TEMPORARY TABLE tmp_oldes_product AS (
    SELECT t1.company_id
    , t1.brand_id
    , t1.brand_name
    , t2.date as date_end
    FROM tmp_table t1
    INNER JOIN  (
        select company_id
        from tmp_table 
        group by id
    ) t2 ON (t1.company_id = t2.company_id and t1.date = t2.min_date)
);

SELECT
company.id,
company.name,
tmp_oldes_product.brand_name,
tmp_oldes_product.brand_id,
company.phone,
company.zip,
company.city,
state.id AS state_id,
state.name AS state_name,
company.street,
USER.id AS user_id,
USER.name AS user_name,
USER.surname AS user_surname,
company_group.id AS company_group_id,
company_group.name AS company_group_name,
agg_company_products.date_end,
agg_company_products.quantity
FROM `company`
LEFT JOIN company_group ON company.company_group = company_group.id
LEFT JOIN USER ON company.supervisor = USER.id
LEFT JOIN state ON company.state = state.id
LEFT JOIN tmp_oldes_product ON company.id = tmp_oldes_product.company_id
LEFT JOIN (
    SELECT company
    , MIN(`date`) as date_end
    , SUM(company_products.quantity) as quantity
    FROM company_products
    GROUP BY company
) agg_company_products ON company.id = agg_company_products.company;

EDIT nie dam głowy, że się wykona, możliwe, że będziesz musiał nanieść drobne poprawki.

0

To nie będzie optymalizacja, a raczej dobra praktyka, unikaj podzapytań one są niepotrzebne, analizując to co masz to dla każdego wiersza 4 podzapytania z czego:

(
    SELECT product_brand.name FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC LIMIT 1
) AS brand_name,
(
    SELECT product_brand.id FROM company_products 
    LEFT JOIN product ON company_products.product = product.id
    LEFT JOIN product_brand ON product.brand_id = product_brand.id
    WHERE  company = company.id ORDER BY `date` ASC LIMIT 1
) AS brand_id

służa do pobrania 2 pól z tego samego rekordu!

Natomiast te:

(SELECT MIN(`date`) FROM company_products WHERE company = company.id ) AS date_end, 
(SELECT SUM(quantity) FROM company_products WHERE company = company.id) AS quantity, 

Służą do pobranie 2 zagregowanych wartości z tej samej tabeli, nielogiczne wydaje mi się że minimalna data jest datą końcową, tu bardziej pasuje max, ale nie wiem na jakich danych operujesz...

To teraz pozbądź się podzapytań, pierwsze np. tak (nie wiem czy to jest najlepsze, ale na codzień nie pracuje w mysql):

SELECT 
	@row_number:=CASE WHEN @company=company THEN @row_number+1 ELSE 1 END AS r
	,@company:=company AS company
	,product_brand.name  as brand_name
	,product_brand.id  as brand_id
FROM 
	company_products 
	LEFT JOIN product ON company_products.product = product.id
	LEFT JOIN product_brand ON product.brand_id = product_brand.id
order by
	company
	,`date`

drugie 2 podzapytania:

SELECT 
	company
	,MIN(`date`) as date_end
	,SUM(quantity) AS quantity
FROM 
	company_products 
group by 
	company

Teraz połącz w całość:

SELECT 
	company.id, 
	company.name, 
	brand_name,
	brand_id,
	date_end, 
	quantity, 
	company.phone, 
	company.zip, 
	company.city, 
	state.id AS state_id, 
	state.name AS state_name, 
	company.street,
	USER.id AS user_id, 
	USER.name AS user_name, 
	USER.surname AS user_surname, 
	company_group.id AS company_group_id, 
	company_group.name AS company_group_name
FROM 
	`company` 
	LEFT JOIN company_group ON company.company_group = company_group.id 
	LEFT JOIN USER ON company.supervisor = USER.id 
	LEFT JOIN state ON company.state = state.id
	LEFT JOIN (SELECT 
					@row_number:=CASE WHEN @company=company THEN @row_number+1 ELSE 1 END AS r
					,@company:=company AS company
					,product_brand.name as brand_name
					,product_brand.id as brand_id
				FROM 
					company_products 
					LEFT JOIN product ON company_products.product = product.id
					LEFT JOIN product_brand ON product.brand_id = product_brand.id
				order by
					company
					,`date`) brand on brand.company=company.company_id and brand.r=1
	LEFT JOIN (SELECT 
				company
				,MIN(`date`) as date_end
				,SUM(quantity) AS quantity
			FROM 
				company_products 
			group by 
				company) S ON S.company=company.company_id 

I porównaj czy jest lepiej...

0

Hej @Panczo niestety to zapytanie nie jest do końca poprawne - dla takiego zapytania każda z firm będzie pojawiać się tyle razy ile ma produktów - nie o taki efekt chodzi, firma ma być raz i wiersz ma zawierać najstarszą datę z dostępnych dla danej firmy produktów, napisałem przed chwilą jeszcze coś takiego:

SELECT 
company.id, 
company.name, 
brand.name as brand_name,
brand.id as brand_id,
c.date_end,
c.quantity,
company.phone, 
company.zip, 
company.city, 
state.id as state_id, 
state.name as state_name, 
company.street,
user.id as user_id, 
user.name as user_name, 
user.surname as user_surname, 
company_group.id as company_group_id, 
company_group.name as company_group_name
FROM `company` 
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN user ON company.supervisor = user.id 
LEFT JOIN state ON company.state = state.id
LEFT JOIN (
SELECT MIN(`date`) as date_end, SUM(quantity) as quantity, company FROM company_products group by company
) c ON c.company = company.id
LEFT JOIN (
  SELECT 
  company_products.company,
  product_brand.id,
  product_brand.name
  FROM company_products 
  LEFT JOIN product on company_products.product = product.id
  LEFT JOIN product_brand ON product.brand_id = product_brand.id
  JOIN (
  SELECT MIN(`date`) AS date_end, company FROM company_products GROUP BY company
  ) as date_table on date_table.date_end = `date` and company_products.company = date_table.company
) brand  ON brand.company = company.id

Ale nadal nie jest to demon szybkości, 0.23 s - to dalej za dużo - może problem leży po stronie konfiguracji bazy a nie samego zapytania?

0

Twoje podzapytanie brand jest inne niż moje, zwróć uwagę na warunek sprzężenia:

ON brand.company=company.company_id AND brand.r=1

dokładnie na to brand.r=1 to zapobiegnie powtarzaniu się danych

0

@Panczo: Testował twoje zapytanie i niestety zwraca wiele firm zamiast jednego tak jak pisałem wczesniej

0

nie bardzo wiem co ma wpływać na powielenie danych, ale to już by trzeba na danych przetestować

0
SELECT 
company.id, 
company.name, 
pb.name AS brand_name,
p.brand_id AS brand_id,
cp.date AS date_end, 
(SELECT SUM(quantity) FROM company_products WHERE company = company.id) AS quantity, 
company.phone, 
company.zip, 
company.city, 
state.id AS state_id, 
state.name AS state_name, 
company.street,
USER.id AS user_id, 
USER.name AS user_name, 
USER.surname AS user_surname, 
company_group.id AS company_group_id, 
company_group.name AS company_group_name
FROM `company` 
left join (select cpp.company, min(cpp.product) product, cpp.date from company_products cpp inner join (select company, min(date) date from company_products group by company) cpd on (cpp.company=cpd.company and cpp.date=cpd.date) group by cpp.company, cpp.date) cp on (company.id=cp.company)
left join product p on (cp.product=p.id) 
left join product_brand pb on (p.brand_id=pb.id)
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN USER ON company.supervisor = USER.id 
LEFT JOIN state ON company.state = state.id
order by cp.date
0

Niestety to zapytanie jest zbyt wolne - najszybsze jeśli chodzi o listowanie wraz z orderami i wyszukiwania to :

SELECT 
company.id, 
company.name, 
brand.name as brand_name,
brand.id as brand_id,
c.date_end,
c.quantity,
company.phone, 
company.zip, 
company.city, 
state.id as state_id, 
state.name as state_name, 
company.street,
user.id as user_id, 
user.name as user_name, 
user.surname as user_surname, 
company_group.id as company_group_id, 
company_group.name as company_group_name
FROM `company` 
LEFT JOIN company_group ON company.company_group = company_group.id 
LEFT JOIN user ON company.supervisor = user.id 
LEFT JOIN state ON company.state = state.id
LEFT JOIN (
SELECT MIN(`date`) as date_end, SUM(quantity) as quantity, company FROM company_products group by company
) c ON c.company = company.id
LEFT JOIN (
  SELECT 
  company_products.company,
  product_brand.id,
  product_brand.name
  FROM company_products 
  LEFT JOIN product on company_products.product = product.id
  LEFT JOIN product_brand ON product.brand_id = product_brand.id
  JOIN (
  SELECT MIN(`date`) AS date_end, company FROM company_products GROUP BY company
  ) as date_table on date_table.date_end = `date` and company_products.company = date_table.company
) brand  ON brand.company = company.id

Wykonuje się koło 0.3s zastanawiam się czy jest opcja przyspiwszenia go bardziej czy to już kwestia zasobów sprzętowych lub konfiguracji - jeśli sprzęt lub config to czy ktoś ma jakieś rady?

1 użytkowników online, w tym zalogowanych: 0, gości: 1